Overview


Dataset statistics

Number of variables: 139
Number of observations: 455212
Missing cells: 33080502
Missing cells (%): 52.3%
Total size in memory: 482.7 MiB
Average record size in memory: 1.1 KiB
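
The derived figures in this table can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over variables × observations and that the average record size is total memory divided by the number of observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 139
n_observations = 455_212
missing_cells = 33_080_502
total_memory_mib = 482.7

# Assumption: one cell per (variable, observation) pair.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 KiB, averaged over all observations.
avg_record_kib = total_memory_mib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_kib:.1f} KiB")
```

Under these assumptions both rounded values agree with the table (52.3% and 1.1 KiB).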

Variable types

Text: 139

Dataset

Description: Fish NMNH Extant Specimen Records 0055081-241126133413365
URL: https://doi.org/10.15468/dl.34mb2x

Alerts

license has constant value "CC0_1_0" Constant
publisher has constant value "National Museum of Natural History, Smithsonian Institution" Constant
institutionID has constant value "urn:lsid:biocol.org:col:34871" Constant
collectionID has constant value "urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f" Constant
institutionCode has constant value "USNM" Constant
collectionCode has constant value "FISH" Constant
datasetName has constant value "NMNH Extant Biology" Constant
sex has constant value "MALE" Constant
eventID has constant value "941.0" Constant
minimumDistanceAboveSurfaceInMeters has constant value "Williams, Jeffrey T." Constant
earliestEraOrLowestErathem has constant value "Animalia" Constant
latestEraOrHighestErathem has constant value "Chordata" Constant
verbatimIdentification has constant value "SPECIES" Constant
identifiedByID has constant value "ACCEPTED" Constant
identificationVerificationStatus has constant value "821cc27a-e3bb-4bc5-ac34-89ada245069d" Constant
identificationRemarks has constant value "US" Constant
taxonConceptID has constant value "StillImage" Constant
acceptedNameUsage has constant value "false" Constant
nameAccordingTo has constant value "1" Constant
namePublishedIn has constant value "44" Constant
subtribe has constant value "EML" Constant
nomenclaturalStatus has constant value "PHL.36.21_1" Constant
taxonRemarks has constant value "Iloilo City" Constant
protocol has constant value "EML" Constant
lastCrawled has constant value "2024-12-02T11:48:23.416Z" Constant
publishedByGbifRegion has constant value "NORTH_AMERICA" Constant
recordNumber has 434386 (95.4%) missing values Missing
recordedBy has 287312 (63.1%) missing values Missing
sex has 455209 (> 99.9%) missing values Missing
preparations has 346184 (76.0%) missing values Missing
associatedSequences has 454762 (99.9%) missing values Missing
occurrenceRemarks has 290485 (63.8%) missing values Missing
verbatimLabel has 455209 (> 99.9%) missing values Missing
materialSampleID has 455209 (> 99.9%) missing values Missing
eventID has 455211 (> 99.9%) missing values Missing
fieldNumber has 274211 (60.2%) missing values Missing
eventDate has 60241 (13.2%) missing values Missing
startDayOfYear has 91500 (20.1%) missing values Missing
endDayOfYear has 91500 (20.1%) missing values Missing
year has 60500 (13.3%) missing values Missing
month has 82757 (18.2%) missing values Missing
day has 108703 (23.9%) missing values Missing
verbatimEventDate has 92472 (20.3%) missing values Missing
locationID has 352012 (77.3%) missing values Missing
higherGeography has 20492 (4.5%) missing values Missing
continent has 162647 (35.7%) missing values Missing
waterBody has 133275 (29.3%) missing values Missing
islandGroup has 390811 (85.9%) missing values Missing
island has 270596 (59.4%) missing values Missing
countryCode has 30434 (6.7%) missing values Missing
stateProvince has 174301 (38.3%) missing values Missing
county has 357533 (78.5%) missing values Missing
locality has 45084 (9.9%) missing values Missing
verbatimElevation has 453008 (99.5%) missing values Missing
verbatimDepth has 446636 (98.1%) missing values Missing
minimumDistanceAboveSurfaceInMeters has 455211 (> 99.9%) missing values Missing
decimalLatitude has 254257 (55.9%) missing values Missing
decimalLongitude has 254257 (55.9%) missing values Missing
coordinateUncertaintyInMeters has 450059 (98.9%) missing values Missing
pointRadiusSpatialFit has 455205 (> 99.9%) missing values Missing
verbatimCoordinateSystem has 308939 (67.9%) missing values Missing
georeferencedBy has 455205 (> 99.9%) missing values Missing
georeferenceProtocol has 437832 (96.2%) missing values Missing
georeferenceRemarks has 432197 (94.9%) missing values Missing
latestEonOrHighestEonothem has 455205 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 455205 (> 99.9%) missing values Missing
latestEraOrHighestErathem has 455205 (> 99.9%) missing values Missing
latestPeriodOrHighestSystem has 455205 (> 99.9%) missing values Missing
latestEpochOrHighestSeries has 455205 (> 99.9%) missing values Missing
highestBiostratigraphicZone has 455205 (> 99.9%) missing values Missing
lithostratigraphicTerms has 455205 (> 99.9%) missing values Missing
member has 455205 (> 99.9%) missing values Missing
verbatimIdentification has 455205 (> 99.9%) missing values Missing
identificationQualifier has 453516 (99.6%) missing values Missing
typeStatus has 436448 (95.9%) missing values Missing
identifiedBy has 421073 (92.5%) missing values Missing
identifiedByID has 455205 (> 99.9%) missing values Missing
identificationVerificationStatus has 455205 (> 99.9%) missing values Missing
identificationRemarks has 455205 (> 99.9%) missing values Missing
taxonID has 455205 (> 99.9%) missing values Missing
parentNameUsageID has 455209 (> 99.9%) missing values Missing
originalNameUsageID has 455209 (> 99.9%) missing values Missing
namePublishedInID has 455205 (> 99.9%) missing values Missing
taxonConceptID has 455210 (> 99.9%) missing values Missing
acceptedNameUsage has 455205 (> 99.9%) missing values Missing
parentNameUsage has 455205 (> 99.9%) missing values Missing
originalNameUsage has 455205 (> 99.9%) missing values Missing
nameAccordingTo has 455205 (> 99.9%) missing values Missing
namePublishedIn has 455205 (> 99.9%) missing values Missing
class has 444746 (97.7%) missing values Missing
superfamily has 455205 (> 99.9%) missing values Missing
subfamily has 455205 (> 99.9%) missing values Missing
subtribe has 455205 (> 99.9%) missing values Missing
genus has 23586 (5.2%) missing values Missing
genericName has 23579 (5.2%) missing values Missing
subgenus has 455206 (> 99.9%) missing values Missing
specificEpithet has 70259 (15.4%) missing values Missing
infraspecificEpithet has 447018 (98.2%) missing values Missing
cultivarEpithet has 455206 (> 99.9%) missing values Missing
verbatimTaxonRank has 455210 (> 99.9%) missing values Missing
vernacularName has 455210 (> 99.9%) missing values Missing
nomenclaturalCode has 455210 (> 99.9%) missing values Missing
nomenclaturalStatus has 455211 (> 99.9%) missing values Missing
taxonRemarks has 455211 (> 99.9%) missing values Missing
depth has 246174 (54.1%) missing values Missing
depthAccuracy has 266866 (58.6%) missing values Missing
distanceFromCentroidInMeters has 454306 (99.8%) missing values Missing
mediaType has 363819 (79.9%) missing values Missing
classKey has 444746 (97.7%) missing values Missing
genusKey has 23593 (5.2%) missing values Missing
speciesKey has 70260 (15.4%) missing values Missing
species has 70260 (15.4%) missing values Missing
repatriated has 30397 (6.7%) missing values Missing
gbifRegion has 32195 (7.1%) missing values Missing
level0Gid has 407295 (89.5%) missing values Missing
level0Name has 407295 (89.5%) missing values Missing
level1Gid has 408402 (89.7%) missing values Missing
level1Name has 408402 (89.7%) missing values Missing
level2Gid has 412023 (90.5%) missing values Missing
level2Name has 412026 (90.5%) missing values Missing
level3Gid has 441377 (97.0%) missing values Missing
level3Name has 441442 (97.0%) missing values Missing
iucnRedListCategory has 11501 (2.5%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
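
The three alert kinds listed above (Constant, Missing, Unique) can be illustrated with a small sketch. This is not the profiling tool's implementation: `classify_column` is a hypothetical helper, `None` stands for a missing cell, and the `missing_threshold` default is an assumed value chosen only for illustration. It follows what the alerts suggest, namely that "Constant" is judged on the non-missing values only (for example, `sex` is reported as both Constant and almost entirely Missing).

```python
from typing import List, Optional, Sequence


def classify_column(values: Sequence[Optional[str]],
                    missing_threshold: float = 0.01) -> List[str]:
    """Return report-style alert labels for one column (hypothetical helper)."""
    if not values:
        return []
    present = [v for v in values if v is not None]
    distinct = set(present)
    missing_share = 1 - len(present) / len(values)

    alerts = []
    if len(distinct) == 1:
        # Exactly one distinct value among the non-missing cells.
        alerts.append("Constant")
    if missing_share >= missing_threshold:
        # Assumed threshold; the tool's actual setting is not shown in the report.
        alerts.append("Missing")
    if len(present) == len(values) and len(distinct) == len(values):
        # No missing cells and no repeated values.
        alerts.append("Unique")
    return alerts


print(classify_column(["USNM"] * 4))
print(classify_column(["MALE"] + [None] * 9))
print(classify_column(["1317202656", "1317202715", "1322535976"]))
```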

Reproduction

Analysis started: 2025-01-08 22:56:37.908163
Analysis finished: 2025-01-08 22:57:00.805949
Duration: 22.9 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
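
The reported duration follows from the two timestamps. A minimal check using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2025-01-08 22:56:37.908163")
finished = datetime.fromisoformat("2025-01-08 22:57:00.805949")

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```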

Variables

gbifID
Text

Unique 

Distinct: 455212
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 4552120
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 455212
Unique (%): 100.0%

Sample

1st row: 1317202656
2nd row: 1317202715
3rd row: 1322535976
4th row: 1317203467
5th row: 2235732924

Value | Count | Frequency (%)
1317202656 | 1 | < 0.1%
1322550703 | 1 | < 0.1%
1317206835 | 1 | < 0.1%
1322539466 | 1 | < 0.1%
2235733055 | 1 | < 0.1%
1322541352 | 1 | < 0.1%
1843575433 | 1 | < 0.1%
1843575436 | 1 | < 0.1%
1322545228 | 1 | < 0.1%
3467167330 | 1 | < 0.1%
Other values (455202) | 455202 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 954765 | 21.0%
3 | 714943 | 15.7%
2 | 571998 | 12.6%
8 | 367591 | 8.1%
0 | 348573 | 7.7%
9 | 346657 | 7.6%
7 | 345233 | 7.6%
4 | 314224 | 6.9%
5 | 302152 | 6.6%
6 | 285984 | 6.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4552120 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 954765 | 21.0%
3 | 714943 | 15.7%
2 | 571998 | 12.6%
8 | 367591 | 8.1%
0 | 348573 | 7.7%
9 | 346657 | 7.6%
7 | 345233 | 7.6%
4 | 314224 | 6.9%
5 | 302152 | 6.6%
6 | 285984 | 6.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4552120 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 954765 | 21.0%
3 | 714943 | 15.7%
2 | 571998 | 12.6%
8 | 367591 | 8.1%
0 | 348573 | 7.7%
9 | 346657 | 7.6%
7 | 345233 | 7.6%
4 | 314224 | 6.9%
5 | 302152 | 6.6%
6 | 285984 | 6.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4552120 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 954765 | 21.0%
3 | 714943 | 15.7%
2 | 571998 | 12.6%
8 | 367591 | 8.1%
0 | 348573 | 7.7%
9 | 346657 | 7.6%
7 | 345233 | 7.6%
4 | 314224 | 6.9%
5 | 302152 | 6.6%
6 | 285984 | 6.3%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 3186484
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0

Value | Count | Frequency (%)
cc0_1_0 | 455212 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
C | 910424 | 28.6%
0 | 910424 | 28.6%
_ | 910424 | 28.6%
1 | 455212 | 14.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1365636 | 42.9%
Uppercase Letter | 910424 | 28.6%
Connector Punctuation | 910424 | 28.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 910424 | 66.7%
1 | 455212 | 33.3%
Uppercase Letter
Value | Count | Frequency (%)
C | 910424 | 100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 910424 | 100.0%
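
The category breakdown for this constant column can be reproduced for a single value with Python's standard `unicodedata` module. This is only a sketch of the idea, not the profiling tool's code; it reports the two-letter Unicode general-category codes (`Lu` = Uppercase Letter, `Nd` = Decimal Number, `Pc` = Connector Punctuation) rather than the long names used in the tables.

```python
import unicodedata
from collections import Counter

value = "CC0_1_0"  # the constant value reported for the license column

# Count the Unicode general category of every character in the value.
counts = Counter(unicodedata.category(ch) for ch in value)

# Share of each category, as a percentage of all characters.
shares = {cat: round(100 * n / len(value), 1) for cat, n in counts.items()}

print(dict(counts))
print(shares)
```

Because every row holds the same value, the per-value shares (42.9% decimal digits, 28.6% uppercase letters, 28.6% connector punctuation) match the column-wide figures in the tables.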

Most occurring scripts

Value | Count | Frequency (%)
Common | 2276060 | 71.4%
Latin | 910424 | 28.6%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 910424 | 40.0%
_ | 910424 | 40.0%
1 | 455212 | 20.0%
Latin
Value | Count | Frequency (%)
C | 910424 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3186484 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
C | 910424 | 28.6%
0 | 910424 | 28.6%
_ | 910424 | 28.6%
1 | 455212 | 14.3%

Distinct: 55507
Distinct (%): 12.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 9104240
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31272
Unique (%): 6.9%

Sample

1st row: 2023-06-02T12:34:00Z
2nd row: 2019-11-27T11:21:00Z
3rd row: 2018-02-21T11:18:00Z
4th row: 2020-03-23T11:52:00Z
5th row: 2019-07-18T12:15:00Z

Value | Count | Frequency (%)
2022-09-13t10:13:00z | 2762 | 0.6%
2015-04-16t13:10:00z | 2261 | 0.5%
2018-07-27t10:48:00z | 2063 | 0.5%
2017-12-01t13:03:00z | 2039 | 0.4%
2017-08-29t08:37:00z | 1935 | 0.4%
2018-07-27t10:44:00z | 1898 | 0.4%
2019-07-18t12:17:00z | 1876 | 0.4%
2017-12-18t13:20:00z | 1847 | 0.4%
2017-12-04t11:22:00z | 1814 | 0.4%
2019-07-18t12:15:00z | 1731 | 0.4%
Other values (55497) | 434986 | 95.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 2223228 | 24.4%
1 | 1214237 | 13.3%
2 | 1142482 | 12.5%
- | 910424 | 10.0%
: | 910424 | 10.0%
T | 455212 | 5.0%
Z | 455212 | 5.0%
4 | 380598 | 4.2%
8 | 314110 | 3.5%
3 | 305429 | 3.4%
Other values (4) | 792884 | 8.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6372968 | 70.0%
Dash Punctuation | 910424 | 10.0%
Other Punctuation | 910424 | 10.0%
Uppercase Letter | 910424 | 10.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 2223228 | 34.9%
1 | 1214237 | 19.1%
2 | 1142482 | 17.9%
4 | 380598 | 6.0%
8 | 314110 | 4.9%
3 | 305429 | 4.8%
5 | 267556 | 4.2%
7 | 199859 | 3.1%
9 | 176646 | 2.8%
6 | 148823 | 2.3%
Uppercase Letter
Value | Count | Frequency (%)
T | 455212 | 50.0%
Z | 455212 | 50.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 910424 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
: | 910424 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 8193816 | 90.0%
Latin | 910424 | 10.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 2223228 | 27.1%
1 | 1214237 | 14.8%
2 | 1142482 | 13.9%
- | 910424 | 11.1%
: | 910424 | 11.1%
4 | 380598 | 4.6%
8 | 314110 | 3.8%
3 | 305429 | 3.7%
5 | 267556 | 3.3%
7 | 199859 | 2.4%
Other values (2) | 325469 | 4.0%
Latin
Value | Count | Frequency (%)
T | 455212 | 50.0%
Z | 455212 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 9104240 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 2223228 | 24.4%
1 | 1214237 | 13.3%
2 | 1142482 | 12.5%
- | 910424 | 10.0%
: | 910424 | 10.0%
T | 455212 | 5.0%
Z | 455212 | 5.0%
4 | 380598 | 4.2%
8 | 314110 | 3.5%
3 | 305429 | 3.4%
Other values (4) | 792884 | 8.7%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 59
Median length: 59
Mean length: 59
Min length: 59

Characters and Unicode

Total characters: 26857508
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: National Museum of Natural History, Smithsonian Institution
2nd row: National Museum of Natural History, Smithsonian Institution
3rd row: National Museum of Natural History, Smithsonian Institution
4th row: National Museum of Natural History, Smithsonian Institution
5th row: National Museum of Natural History, Smithsonian Institution

Value | Count | Frequency (%)
national | 455212 | 14.3%
museum | 455212 | 14.3%
of | 455212 | 14.3%
natural | 455212 | 14.3%
history | 455212 | 14.3%
smithsonian | 455212 | 14.3%
institution | 455212 | 14.3%

Most occurring characters

Value | Count | Frequency (%)
t | 3186484 | 11.9%
i | 2731272 | 10.2%
(space) | 2731272 | 10.2%
a | 2276060 | 8.5%
o | 2276060 | 8.5%
n | 2276060 | 8.5%
s | 1820848 | 6.8%
u | 1820848 | 6.8%
r | 910424 | 3.4%
m | 910424 | 3.4%
Other values (11) | 5917756 | 22.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 20939752 | 78.0%
Space Separator | 2731272 | 10.2%
Uppercase Letter | 2731272 | 10.2%
Other Punctuation | 455212 | 1.7%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 3186484 | 15.2%
i | 2731272 | 13.0%
a | 2276060 | 10.9%
o | 2276060 | 10.9%
n | 2276060 | 10.9%
s | 1820848 | 8.7%
u | 1820848 | 8.7%
r | 910424 | 4.3%
m | 910424 | 4.3%
l | 910424 | 4.3%
Other values (4) | 1820848 | 8.7%
Uppercase Letter
Value | Count | Frequency (%)
N | 910424 | 33.3%
M | 455212 | 16.7%
H | 455212 | 16.7%
S | 455212 | 16.7%
I | 455212 | 16.7%
Space Separator
Value | Count | Frequency (%)
(space) | 2731272 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 455212 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 23671024 | 88.1%
Common | 3186484 | 11.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
t | 3186484 | 13.5%
i | 2731272 | 11.5%
a | 2276060 | 9.6%
o | 2276060 | 9.6%
n | 2276060 | 9.6%
s | 1820848 | 7.7%
u | 1820848 | 7.7%
r | 910424 | 3.8%
m | 910424 | 3.8%
N | 910424 | 3.8%
Other values (9) | 4552120 | 19.2%
Common
Value | Count | Frequency (%)
(space) | 2731272 | 85.7%
, | 455212 | 14.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 26857508 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
t | 3186484 | 11.9%
i | 2731272 | 10.2%
(space) | 2731272 | 10.2%
a | 2276060 | 8.5%
o | 2276060 | 8.5%
n | 2276060 | 8.5%
s | 1820848 | 6.8%
u | 1820848 | 6.8%
r | 910424 | 3.4%
m | 910424 | 3.4%
Other values (11) | 5917756 | 22.0%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 13201148
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value | Count | Frequency (%)
urn:lsid:biocol.org:col:34871 | 455212 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
o | 1820848 | 13.8%
: | 1820848 | 13.8%
l | 1365636 | 10.3%
i | 910424 | 6.9%
r | 910424 | 6.9%
c | 910424 | 6.9%
g | 455212 | 3.4%
7 | 455212 | 3.4%
8 | 455212 | 3.4%
4 | 455212 | 3.4%
Other values (8) | 3641696 | 27.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 8649028 | 65.5%
Other Punctuation | 2276060 | 17.2%
Decimal Number | 2276060 | 17.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 1820848 | 21.1%
l | 1365636 | 15.8%
i | 910424 | 10.5%
r | 910424 | 10.5%
c | 910424 | 10.5%
g | 455212 | 5.3%
u | 455212 | 5.3%
b | 455212 | 5.3%
d | 455212 | 5.3%
s | 455212 | 5.3%
Decimal Number
Value | Count | Frequency (%)
7 | 455212 | 20.0%
8 | 455212 | 20.0%
4 | 455212 | 20.0%
3 | 455212 | 20.0%
1 | 455212 | 20.0%
Other Punctuation
Value | Count | Frequency (%)
: | 1820848 | 80.0%
. | 455212 | 20.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 8649028 | 65.5%
Common | 4552120 | 34.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 1820848 | 21.1%
l | 1365636 | 15.8%
i | 910424 | 10.5%
r | 910424 | 10.5%
c | 910424 | 10.5%
g | 455212 | 5.3%
u | 455212 | 5.3%
b | 455212 | 5.3%
d | 455212 | 5.3%
s | 455212 | 5.3%
Common
Value | Count | Frequency (%)
: | 1820848 | 40.0%
7 | 455212 | 10.0%
8 | 455212 | 10.0%
4 | 455212 | 10.0%
3 | 455212 | 10.0%
. | 455212 | 10.0%
1 | 455212 | 10.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13201148 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 1820848 | 13.8%
: | 1820848 | 13.8%
l | 1365636 | 10.3%
i | 910424 | 6.9%
r | 910424 | 6.9%
c | 910424 | 6.9%
g | 455212 | 3.4%
7 | 455212 | 3.4%
8 | 455212 | 3.4%
4 | 455212 | 3.4%
Other values (8) | 3641696 | 27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 20484540
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f
2nd row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f
3rd row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f
4th row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f
5th row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f

Value | Count | Frequency (%)
urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f | 455212 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
c | 2731272 | 13.3%
f | 2276060 | 11.1%
9 | 1820848 | 8.9%
- | 1820848 | 8.9%
d | 1820848 | 8.9%
b | 1365636 | 6.7%
5 | 1365636 | 6.7%
u | 1365636 | 6.7%
: | 910424 | 4.4%
3 | 910424 | 4.4%
Other values (8) | 4096908 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 10925088 | 53.3%
Decimal Number | 6828180 | 33.3%
Dash Punctuation | 1820848 | 8.9%
Other Punctuation | 910424 | 4.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
c | 2731272 | 25.0%
f | 2276060 | 20.8%
d | 1820848 | 16.7%
b | 1365636 | 12.5%
u | 1365636 | 12.5%
r | 455212 | 4.2%
i | 455212 | 4.2%
n | 455212 | 4.2%
Decimal Number
Value | Count | Frequency (%)
9 | 1820848 | 26.7%
5 | 1365636 | 20.0%
3 | 910424 | 13.3%
8 | 910424 | 13.3%
0 | 455212 | 6.7%
4 | 455212 | 6.7%
6 | 455212 | 6.7%
1 | 455212 | 6.7%
Dash Punctuation
Value | Count | Frequency (%)
- | 1820848 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
: | 910424 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 10925088 | 53.3%
Common | 9559452 | 46.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
9 | 1820848 | 19.0%
- | 1820848 | 19.0%
5 | 1365636 | 14.3%
: | 910424 | 9.5%
3 | 910424 | 9.5%
8 | 910424 | 9.5%
0 | 455212 | 4.8%
4 | 455212 | 4.8%
6 | 455212 | 4.8%
1 | 455212 | 4.8%
Latin
Value | Count | Frequency (%)
c | 2731272 | 25.0%
f | 2276060 | 20.8%
d | 1820848 | 16.7%
b | 1365636 | 12.5%
u | 1365636 | 12.5%
r | 455212 | 4.2%
i | 455212 | 4.2%
n | 455212 | 4.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 20484540 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
c | 2731272 | 13.3%
f | 2276060 | 11.1%
9 | 1820848 | 8.9%
- | 1820848 | 8.9%
d | 1820848 | 8.9%
b | 1365636 | 6.7%
5 | 1365636 | 6.7%
u | 1365636 | 6.7%
: | 910424 | 4.4%
3 | 910424 | 4.4%
Other values (8) | 4096908 | 20.0%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1820848
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM

Value | Count | Frequency (%)
usnm | 455212 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
U | 455212 | 25.0%
S | 455212 | 25.0%
N | 455212 | 25.0%
M | 455212 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1820848 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
U | 455212 | 25.0%
S | 455212 | 25.0%
N | 455212 | 25.0%
M | 455212 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1820848 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
U | 455212 | 25.0%
S | 455212 | 25.0%
N | 455212 | 25.0%
M | 455212 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1820848 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
U | 455212 | 25.0%
S | 455212 | 25.0%
N | 455212 | 25.0%
M | 455212 | 25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1820848
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FISH
2nd row: FISH
3rd row: FISH
4th row: FISH
5th row: FISH

Value | Count | Frequency (%)
fish | 455212 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
F | 455212 | 25.0%
I | 455212 | 25.0%
S | 455212 | 25.0%
H | 455212 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1820848 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
F | 455212 | 25.0%
I | 455212 | 25.0%
S | 455212 | 25.0%
H | 455212 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1820848 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
F | 455212 | 25.0%
I | 455212 | 25.0%
S | 455212 | 25.0%
H | 455212 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1820848 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
F | 455212 | 25.0%
I | 455212 | 25.0%
S | 455212 | 25.0%
H | 455212 | 25.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 8649028
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value | Count | Frequency (%)
nmnh | 455212 | 33.3%
extant | 455212 | 33.3%
biology | 455212 | 33.3%

Most occurring characters

Value | Count | Frequency (%)
N | 910424 | 10.5%
(space) | 910424 | 10.5%
t | 910424 | 10.5%
o | 910424 | 10.5%
M | 455212 | 5.3%
H | 455212 | 5.3%
E | 455212 | 5.3%
x | 455212 | 5.3%
a | 455212 | 5.3%
n | 455212 | 5.3%
Other values (5) | 2276060 | 26.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5007332 | 57.9%
Uppercase Letter | 2731272 | 31.6%
Space Separator | 910424 | 10.5%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 910424 | 18.2%
o | 910424 | 18.2%
x | 455212 | 9.1%
a | 455212 | 9.1%
n | 455212 | 9.1%
i | 455212 | 9.1%
l | 455212 | 9.1%
g | 455212 | 9.1%
y | 455212 | 9.1%
Uppercase Letter
Value | Count | Frequency (%)
N | 910424 | 33.3%
M | 455212 | 16.7%
H | 455212 | 16.7%
E | 455212 | 16.7%
B | 455212 | 16.7%
Space Separator
Value | Count | Frequency (%)
(space) | 910424 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7738604 | 89.5%
Common | 910424 | 10.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
N | 910424 | 11.8%
t | 910424 | 11.8%
o | 910424 | 11.8%
M | 455212 | 5.9%
H | 455212 | 5.9%
E | 455212 | 5.9%
x | 455212 | 5.9%
a | 455212 | 5.9%
n | 455212 | 5.9%
B | 455212 | 5.9%
Other values (4) | 1820848 | 23.5%
Common
Value | Count | Frequency (%)
(space) | 910424 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8649028 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
N | 910424 | 10.5%
(space) | 910424 | 10.5%
t | 910424 | 10.5%
o | 910424 | 10.5%
M | 455212 | 5.3%
H | 455212 | 5.3%
E | 455212 | 5.3%
x | 455212 | 5.3%
a | 455212 | 5.3%
n | 455212 | 5.3%
Other values (5) | 2276060 | 26.3%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 19
Median length: 18
Mean length: 18.08130717
Min length: 18

Characters and Unicode

Total characters: 8230828
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESERVED_SPECIMEN
2nd row: PRESERVED_SPECIMEN
3rd row: PRESERVED_SPECIMEN
4th row: PRESERVED_SPECIMEN
5th row: MACHINE_OBSERVATION

Value | Count | Frequency (%)
preserved_specimen | 418200 | 91.9%
machine_observation | 37012 | 8.1%

Most occurring characters

Value | Count | Frequency (%)
E | 2165024 | 26.3%
R | 873412 | 10.6%
S | 873412 | 10.6%
P | 836400 | 10.2%
I | 492224 | 6.0%
N | 492224 | 6.0%
V | 455212 | 5.5%
_ | 455212 | 5.5%
C | 455212 | 5.5%
M | 455212 | 5.5%
Other values (6) | 677284 | 8.2%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 7775616 | 94.5%
Connector Punctuation | 455212 | 5.5%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
E | 2165024 | 27.8%
R | 873412 | 11.2%
S | 873412 | 11.2%
P | 836400 | 10.8%
I | 492224 | 6.3%
N | 492224 | 6.3%
V | 455212 | 5.9%
C | 455212 | 5.9%
M | 455212 | 5.9%
D | 418200 | 5.4%
Other values (5) | 259084 | 3.3%
Connector Punctuation
Value | Count | Frequency (%)
_ | 455212 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7775616 | 94.5%
Common | 455212 | 5.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
E | 2165024 | 27.8%
R | 873412 | 11.2%
S | 873412 | 11.2%
P | 836400 | 10.8%
I | 492224 | 6.3%
N | 492224 | 6.3%
V | 455212 | 5.9%
C | 455212 | 5.9%
M | 455212 | 5.9%
D | 418200 | 5.4%
Other values (5) | 259084 | 3.3%
Common
Value | Count | Frequency (%)
_ | 455212 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8230828 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
E | 2165024 | 26.3%
R | 873412 | 10.6%
S | 873412 | 10.6%
P | 836400 | 10.2%
I | 492224 | 6.0%
N | 492224 | 6.0%
V | 455212 | 5.5%
_ | 455212 | 5.5%
C | 455212 | 5.5%
M | 455212 | 5.5%
Other values (6) | 677284 | 8.2%

occurrenceID
Text

Unique 

Distinct: 455212
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 28678356
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 455212
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/30002bab5-5433-4b6c-8496-286a4a697fd7
2nd row: http://n2t.net/ark:/65665/3000315ff-b613-4f47-813c-5c48d8e0a883
3rd row: http://n2t.net/ark:/65665/3ebef4ab3-d946-4961-9221-c7c9692640f8
4th row: http://n2t.net/ark:/65665/3000bbb81-e139-47f8-b2bc-db762804769d
5th row: http://n2t.net/ark:/65665/3002333ca-4702-4d0d-93cd-265885eff56a

Value | Count | Frequency (%)
http://n2t.net/ark:/65665/30002bab5-5433-4b6c-8496-286a4a697fd7 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec95962f-5e2d-41b1-9854-38d77fc2256f | 1 | < 0.1%
http://n2t.net/ark:/65665/300319b93-d6b8-4b79-b23d-d3825483b706 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec168a54-17b9-4a71-9ecc-92d446311c64 | 1 | < 0.1%
http://n2t.net/ark:/65665/300370a00-b7af-441b-87cd-9c14a7b5b464 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec2b37a7-5b92-4aa2-ad75-a247e8e353f8 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec3b1e55-b813-49b8-83c7-eadc9323514e | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec3f4a3d-f79a-4cba-b8e4-022b57aa27d6 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec57b74b-7fc8-4006-ad93-945fb0784573 | 1 | < 0.1%
http://n2t.net/ark:/65665/30280d648-bf0c-4907-af3e-5cdc58054b4e | 1 | < 0.1%
Other values (455202) | 455202 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
/ | 2276060 | 7.9%
6 | 2219127 | 7.7%
- | 1820848 | 6.3%
t | 1820848 | 6.3%
5 | 1762661 | 6.1%
a | 1423503 | 5.0%
2 | 1308759 | 4.6%
4 | 1308747 | 4.6%
e | 1308683 | 4.6%
3 | 1308534 | 4.6%
Other values (16) | 12120586 | 42.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 12402148 | 43.2%
Lowercase Letter | 10813664 | 37.7%
Other Punctuation | 3641696 | 12.7%
Dash Punctuation | 1820848 | 6.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 1820848 | 16.8%
a | 1423503 | 13.2%
e | 1308683 | 12.1%
b | 969407 | 9.0%
n | 910424 | 8.4%
f | 853663 | 7.9%
d | 853473 | 7.9%
c | 852815 | 7.9%
k | 455212 | 4.2%
r | 455212 | 4.2%
Other values (2) | 910424 | 8.4%
Decimal Number
Value | Count | Frequency (%)
6 | 2219127 | 17.9%
5 | 1762661 | 14.2%
2 | 1308759 | 10.6%
4 | 1308747 | 10.6%
3 | 1308534 | 10.6%
9 | 967258 | 7.8%
8 | 966762 | 7.8%
0 | 854227 | 6.9%
1 | 853581 | 6.9%
7 | 852492 | 6.9%
Other Punctuation
Value | Count | Frequency (%)
/ | 2276060 | 62.5%
: | 910424 | 25.0%
. | 455212 | 12.5%
Dash Punctuation
Value | Count | Frequency (%)
- | 1820848 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 17864692 | 62.3%
Latin | 10813664 | 37.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
/ | 2276060 | 12.7%
6 | 2219127 | 12.4%
- | 1820848 | 10.2%
5 | 1762661 | 9.9%
2 | 1308759 | 7.3%
4 | 1308747 | 7.3%
3 | 1308534 | 7.3%
9 | 967258 | 5.4%
8 | 966762 | 5.4%
: | 910424 | 5.1%
Other values (4) | 3015512 | 16.9%
Latin
Value | Count | Frequency (%)
t | 1820848 | 16.8%
a | 1423503 | 13.2%
e | 1308683 | 12.1%
b | 969407 | 9.0%
n | 910424 | 8.4%
f | 853663 | 7.9%
d | 853473 | 7.9%
c | 852815 | 7.9%
k | 455212 | 4.2%
r | 455212 | 4.2%
Other values (2) | 910424 | 8.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 28678356 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 2276060 | 7.9%
6 | 2219127 | 7.7%
- | 1820848 | 6.3%
t | 1820848 | 6.3%
5 | 1762661 | 6.1%
a | 1423503 | 5.0%
2 | 1308759 | 4.6%
4 | 1308747 | 4.6%
e | 1308683 | 4.6%
3 | 1308534 | 4.6%
Other values (16) | 12120586 | 42.3%
Distinct: 455204
Distinct (%): > 99.9%
Missing: 3
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 14
Median length: 11
Mean length: 11.04621613
Min length: 6

Characters and Unicode

Total characters: 5028337
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 455199
Unique (%): > 99.9%

Sample

1st row: USNM 51082
2nd row: USNM 110432
3rd row: USNM 49860
4th row: USNM 239751
5th row: USNM RAD122557

Value | Count | Frequency (%)
usnm | 455209 | 50.0%
465983 | 2 | < 0.1%
466814 | 2 | < 0.1%
135878 | 2 | < 0.1%
114351 | 2 | < 0.1%
rad125895 | 2 | < 0.1%
fin30680 | 1 | < 0.1%
253658 | 1 | < 0.1%
457486 | 1 | < 0.1%
97025 | 1 | < 0.1%
Other values (455195) | 455195 | 50.0%
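
The value counts above are per word rather than per full string: the sample value `USNM 51082` contributes one count to `usnm` and one to `51082`. A minimal sketch of that kind of tally follows, assuming lower-casing and whitespace splitting. The report's exact tokenisation rules are not stated here, and `word_counts` is a hypothetical helper name.

```python
from collections import Counter

def word_counts(values):
    """Count lower-cased, whitespace-separated words across non-missing strings."""
    counts = Counter()
    for value in values:
        if value is None:
            continue  # missing values contribute no words
        counts.update(value.lower().split())
    return counts

# The five sample rows listed for this variable.
sample = ["USNM 51082", "USNM 110432", "USNM 49860", "USNM 239751", "USNM RAD122557"]
counts = word_counts(sample)
print(counts.most_common(3))
```

Under this reading, a prefix shared by almost every row (here `usnm`) takes about half of all word occurrences, which fits the 50.0% shown in the table.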

Most occurring characters

ValueCountFrequency (%)
N 464924
 
9.2%
U 455209
 
9.1%
M 455209
 
9.1%
(space) 455209
 
9.1%
S 455209
 
9.1%
1 348722
 
6.9%
2 335206
 
6.7%
3 330422
 
6.6%
4 286516
 
5.7%
6 228754
 
4.5%
Other values (10) 1212957
24.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2644490
52.6%
Uppercase Letter 1928638
38.4%
Space Separator 455209
 
9.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 348722
13.2%
2 335206
12.7%
3 330422
12.5%
4 286516
10.8%
6 228754
8.7%
0 227699
8.6%
7 226592
8.6%
5 223329
8.4%
9 219353
8.3%
8 217897
8.2%
Uppercase Letter
ValueCountFrequency (%)
N 464924
24.1%
U 455209
23.6%
M 455209
23.6%
S 455209
23.6%
D 26219
 
1.4%
A 26219
 
1.4%
R 26219
 
1.4%
F 9715
 
0.5%
I 9715
 
0.5%
Space Separator
ValueCountFrequency (%)
(space) 455209
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3099699
61.6%
Latin 1928638
38.4%

Most frequent character per script

Common
ValueCountFrequency (%)
(space) 455209
14.7%
1 348722
11.3%
2 335206
10.8%
3 330422
10.7%
4 286516
9.2%
6 228754
7.4%
0 227699
7.3%
7 226592
7.3%
5 223329
7.2%
9 219353
7.1%
Latin
ValueCountFrequency (%)
N 464924
24.1%
U 455209
23.6%
M 455209
23.6%
S 455209
23.6%
D 26219
 
1.4%
A 26219
 
1.4%
R 26219
 
1.4%
F 9715
 
0.5%
I 9715
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5028337
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 464924
 
9.2%
U 455209
 
9.1%
M 455209
 
9.1%
(space) 455209
 
9.1%
S 455209
 
9.1%
1 348722
 
6.9%
2 335206
 
6.7%
3 330422
 
6.6%
4 286516
 
5.7%
6 228754
 
4.5%
Other values (10) 1212957
24.1%

recordNumber
Text

Missing 

Distinct: 20814
Distinct (%): 99.9%
Missing: 434386
Missing (%): 95.4%
Memory size: 3.5 MiB

Length

Max length: 42
Median length: 8
Mean length: 8.388456737
Min length: 1

Characters and Unicode

Total characters: 174698
Distinct characters: 64
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20803
Unique (%): 99.9%

Sample

1st row: PHISH-032
2nd row: AUST-251
3rd row: MOC11646
4th row: RP-202
5th row: SCIL-052

Value | Count | Frequency (%)
blz | 1430 | 5.5%
bah | 710 | 2.8%
tci | 681 | 2.6%
sms | 536 | 2.1%
cur | 426 | 1.7%
tob | 393 | 1.5%
twn | 280 | 1.1%
hbb | 157 | 0.6%
fcc | 146 | 0.6%
keb&mgg | 111 | 0.4%
Other values (18988) | 20921 | 81.1%

Most occurring characters

ValueCountFrequency (%)
1 16672
 
9.5%
0 13195
 
7.6%
- 11059
 
6.3%
2 9655
 
5.5%
3 7495
 
4.3%
7 6732
 
3.9%
4 6594
 
3.8%
9 6304
 
3.6%
S 6075
 
3.5%
I 5691
 
3.3%
Other values (54) 85226
48.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 83146
47.6%
Uppercase Letter 68573
39.3%
Dash Punctuation 11059
 
6.3%
Lowercase Letter 6363
 
3.6%
Space Separator 4965
 
2.8%
Connector Punctuation 199
 
0.1%
Other Punctuation 135
 
0.1%
Close Punctuation 129
 
0.1%
Open Punctuation 129
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 6075
 
8.9%
I 5691
 
8.3%
H 4851
 
7.1%
C 4741
 
6.9%
R 4572
 
6.7%
B 4223
 
6.2%
P 3828
 
5.6%
L 3820
 
5.6%
M 3805
 
5.5%
U 3666
 
5.3%
Other values (16) 23301
34.0%
Lowercase Letter
ValueCountFrequency (%)
i 1435
22.6%
m 1341
21.1%
b 1290
20.3%
o 1289
20.3%
n 256
 
4.0%
a 147
 
2.3%
y 140
 
2.2%
u 138
 
2.2%
q 136
 
2.1%
t 133
 
2.1%
Other values (7) 58
 
0.9%
Decimal Number
ValueCountFrequency (%)
1 16672
20.1%
0 13195
15.9%
2 9655
11.6%
3 7495
9.0%
7 6732
8.1%
4 6594
 
7.9%
9 6304
 
7.6%
8 5653
 
6.8%
5 5523
 
6.6%
6 5323
 
6.4%
Other Punctuation
ValueCountFrequency (%)
& 111
82.2%
; 9
 
6.7%
. 9
 
6.7%
* 4
 
3.0%
: 1
 
0.7%
? 1
 
0.7%
Dash Punctuation
ValueCountFrequency (%)
- 11059
100.0%
Space Separator
ValueCountFrequency (%)
(space) 4965
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 199
100.0%
Close Punctuation
ValueCountFrequency (%)
) 129
100.0%
Open Punctuation
ValueCountFrequency (%)
( 129
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 99762
57.1%
Latin 74936
42.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 6075
 
8.1%
I 5691
 
7.6%
H 4851
 
6.5%
C 4741
 
6.3%
R 4572
 
6.1%
B 4223
 
5.6%
P 3828
 
5.1%
L 3820
 
5.1%
M 3805
 
5.1%
U 3666
 
4.9%
Other values (33) 29664
39.6%
Common
ValueCountFrequency (%)
1 16672
16.7%
0 13195
13.2%
- 11059
11.1%
2 9655
9.7%
3 7495
7.5%
7 6732
6.7%
4 6594
 
6.6%
9 6304
 
6.3%
8 5653
 
5.7%
5 5523
 
5.5%
Other values (11) 10880
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 174698
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 16672
 
9.5%
0 13195
 
7.6%
- 11059
 
6.3%
2 9655
 
5.5%
3 7495
 
4.3%
7 6732
 
3.9%
4 6594
 
3.8%
9 6304
 
3.6%
S 6075
 
3.5%
I 5691
 
3.3%
Other values (54) 85226
48.8%

recordedBy
Text

Missing 

Distinct: 7883
Distinct (%): 4.7%
Missing: 287312
Missing (%): 63.1%
Memory size: 3.5 MiB

Length

Max length: 240
Median length: 115
Mean length: 26.11823109
Min length: 1

Characters and Unicode

Total characters: 4385251
Distinct characters: 76
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3022
Unique (%): 1.8%

Sample

1st row: J. Snyder
2nd row: D. Richardson
3rd row: Smithsonian Team, A. Alcala & Silliman University Group
4th row: Bronson
5th row: G. Hendler

Value | Count | Frequency (%)
(value not shown) | 77670 | 9.1%
j | 42738 | 5.0%
m | 36874 | 4.3%
d | 28293 | 3.3%
r | 27606 | 3.2%
c | 22145 | 2.6%
l | 20146 | 2.3%
h | 19636 | 2.3%
s | 18374 | 2.1%
a | 17770 | 2.1%
Other values (4981) | 546427 | 63.7%

Most occurring characters

ValueCountFrequency (%)
(space) 689779
15.7%
. 354996
 
8.1%
e 287100
 
6.5%
a 267898
 
6.1%
r 202145
 
4.6%
n 201834
 
4.6%
i 198853
 
4.5%
o 170908
 
3.9%
l 161879
 
3.7%
t 156528
 
3.6%
Other values (66) 1693331
38.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2340145
53.4%
Uppercase Letter 780626
 
17.8%
Space Separator 689779
 
15.7%
Other Punctuation 562437
 
12.8%
Dash Punctuation 6626
 
0.2%
Open Punctuation 2738
 
0.1%
Close Punctuation 2738
 
0.1%
Decimal Number 162
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 287100
12.3%
a 267898
11.4%
r 202145
8.6%
n 201834
8.6%
i 198853
8.5%
o 170908
 
7.3%
l 161879
 
6.9%
t 156528
 
6.7%
s 132876
 
5.7%
h 80006
 
3.4%
Other values (19) 480118
20.5%
Uppercase Letter
ValueCountFrequency (%)
M 78820
 
10.1%
S 64727
 
8.3%
C 58544
 
7.5%
B 55964
 
7.2%
J 53696
 
6.9%
R 51837
 
6.6%
H 41812
 
5.4%
P 41068
 
5.3%
D 40847
 
5.2%
W 39861
 
5.1%
Other values (16) 253450
32.5%
Decimal Number
ValueCountFrequency (%)
9 39
24.1%
0 26
16.0%
1 21
13.0%
8 21
13.0%
3 17
10.5%
2 16
9.9%
7 11
 
6.8%
6 5
 
3.1%
4 5
 
3.1%
5 1
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 354996
63.1%
, 129603
 
23.0%
& 77581
 
13.8%
' 240
 
< 0.1%
# 10
 
< 0.1%
? 5
 
< 0.1%
/ 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 689779
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6626
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2738
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2738
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3120771
71.2%
Common 1264480
28.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 287100
 
9.2%
a 267898
 
8.6%
r 202145
 
6.5%
n 201834
 
6.5%
i 198853
 
6.4%
o 170908
 
5.5%
l 161879
 
5.2%
t 156528
 
5.0%
s 132876
 
4.3%
h 80006
 
2.6%
Other values (45) 1260744
40.4%
Common
ValueCountFrequency (%)
(space) 689779
54.6%
. 354996
28.1%
, 129603
 
10.2%
& 77581
 
6.1%
- 6626
 
0.5%
( 2738
 
0.2%
) 2738
 
0.2%
' 240
 
< 0.1%
9 39
 
< 0.1%
0 26
 
< 0.1%
Other values (11) 114
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4385214
> 99.9%
None 37
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 689779
15.7%
. 354996
 
8.1%
e 287100
 
6.5%
a 267898
 
6.1%
r 202145
 
4.6%
n 201834
 
4.6%
i 198853
 
4.5%
o 170908
 
3.9%
l 161879
 
3.7%
t 156528
 
3.6%
Other values (63) 1693294
38.6%
None
ValueCountFrequency (%)
ü 32
86.5%
ô 4
 
10.8%
í 1
 
2.7%
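
Characters outside ASCII, such as the ü, ô and í counted above, can be found by checking code points above 127. The sketch below is one way to do that with the standard library. The names in `hypothetical_values` are invented for illustration and are not taken from the dataset, and `non_ascii_counts` is an assumed helper name.

```python
import unicodedata
from collections import Counter

def non_ascii_counts(values):
    """Count characters whose code point lies outside the ASCII range (0-127)."""
    counts = Counter()
    for value in values:
        if value is None:
            continue  # skip missing values
        counts.update(ch for ch in value if ord(ch) > 127)
    return counts

# Invented example strings; escapes keep the accented letters unambiguous.
hypothetical_values = ["G. M\u00fcller", "J. C\u00f4t\u00e9", None, "D. Richardson"]
counts = non_ascii_counts(hypothetical_values)
for ch, n in counts.items():
    # unicodedata.name gives the official character name for each code point.
    print(ch, n, unicodedata.name(ch))
```

Listing the official character names helps separate genuine diacritics in collector names from mis-encoded bytes.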
Distinct: 619
Distinct (%): 0.1%
Missing: 15
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 4
Median length: 1
Mean length: 1.121072854
Min length: 1

Characters and Unicode

Total characters: 510309
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 251
Unique (%): 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 9
4th row: 12
5th row: 1

Value | Count | Frequency (%)
1 | 245716 | 54.0%
2 | 61709 | 13.6%
3 | 30726 | 6.8%
4 | 19436 | 4.3%
5 | 14092 | 3.1%
6 | 10260 | 2.3%
7 | 7454 | 1.6%
10 | 6775 | 1.5%
8 | 6055 | 1.3%
9 | 4855 | 1.1%
Other values (609) | 48119 | 10.6%

Most occurring characters

ValueCountFrequency (%)
1 279516
54.8%
2 78463
 
15.4%
3 39933
 
7.8%
4 26278
 
5.1%
5 23283
 
4.6%
0 18677
 
3.7%
6 14874
 
2.9%
7 11614
 
2.3%
8 9547
 
1.9%
9 8124
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 510309
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 279516
54.8%
2 78463
 
15.4%
3 39933
 
7.8%
4 26278
 
5.1%
5 23283
 
4.6%
0 18677
 
3.7%
6 14874
 
2.9%
7 11614
 
2.3%
8 9547
 
1.9%
9 8124
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Common 510309
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 279516
54.8%
2 78463
 
15.4%
3 39933
 
7.8%
4 26278
 
5.1%
5 23283
 
4.6%
0 18677
 
3.7%
6 14874
 
2.9%
7 11614
 
2.3%
8 9547
 
1.9%
9 8124
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 510309
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 279516
54.8%
2 78463
 
15.4%
3 39933
 
7.8%
4 26278
 
5.1%
5 23283
 
4.6%
0 18677
 
3.7%
6 14874
 
2.9%
7 11614
 
2.3%
8 9547
 
1.9%
9 8124
 
1.6%

sex
Text

Constant  Missing 

Distinct: 1
Distinct (%): 33.3%
Missing: 455209
Missing (%): > 99.9%
Memory size: 3.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 12
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%
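
The figures for this variable are consistent with the distinct and unique percentages being taken over non-missing values (1 distinct value out of 3 present gives 33.3%) and the missing percentage over all observations. The sketch below reproduces that arithmetic under those assumptions. `summarize` is a hypothetical helper, not the report's own code.

```python
from collections import Counter

def summarize(values):
    """Distinct, missing and unique counts, with percentages as described above."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    distinct = len(counts)
    # "Unique" here means values that occur exactly once.
    unique = sum(1 for c in counts.values() if c == 1)
    missing = len(values) - len(present)
    return {
        "distinct": distinct,
        "distinct_pct": 100 * distinct / len(present) if present else 0.0,
        "missing": missing,
        "missing_pct": 100 * missing / len(values) if values else 0.0,
        "unique": unique,
        "unique_pct": 100 * unique / len(present) if present else 0.0,
    }

# Same shape as this variable: 3 present values, 455209 missing.
column = ["MALE"] * 3 + [None] * 455209
summary = summarize(column)
print(summary)
```

Because "MALE" occurs three times, it is distinct but not unique, which matches the Unique count of 0.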

Sample

1st row: MALE
2nd row: MALE
3rd row: MALE

Value | Count | Frequency (%)
male | 3 | 100.0%

Most occurring characters

ValueCountFrequency (%)
M 3
25.0%
A 3
25.0%
L 3
25.0%
E 3
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 12
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 3
25.0%
A 3
25.0%
L 3
25.0%
E 3
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 3
25.0%
A 3
25.0%
L 3
25.0%
E 3
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 3
25.0%
A 3
25.0%
L 3
25.0%
E 3
25.0%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 6.999995606
Min length: 6

Characters and Unicode

Total characters: 3186482
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESENT
2nd row: PRESENT
3rd row: PRESENT
4th row: PRESENT
5th row: PRESENT

Value | Count | Frequency (%)
present | 455210 | > 99.9%
absent | 2 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
E 910422
28.6%
S 455212
14.3%
N 455212
14.3%
T 455212
14.3%
P 455210
14.3%
R 455210
14.3%
A 2
 
< 0.1%
B 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3186482
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 910422
28.6%
S 455212
14.3%
N 455212
14.3%
T 455212
14.3%
P 455210
14.3%
R 455210
14.3%
A 2
 
< 0.1%
B 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3186482
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 910422
28.6%
S 455212
14.3%
N 455212
14.3%
T 455212
14.3%
P 455210
14.3%
R 455210
14.3%
A 2
 
< 0.1%
B 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3186482
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 910422
28.6%
S 455212
14.3%
N 455212
14.3%
T 455212
14.3%
P 455210
14.3%
R 455210
14.3%
A 2
 
< 0.1%
B 2
 
< 0.1%

preparations
Text

Missing 

Distinct: 325
Distinct (%): 0.3%
Missing: 346184
Missing (%): 76.0%
Memory size: 3.5 MiB

Length

Max length: 255
Median length: 192
Mean length: 11.80351836
Min length: 4

Characters and Unicode

Total characters: 1286914
Distinct characters: 56
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 141
Unique (%): 0.1%

Sample

1st row: Dry Osteological Specimen
2nd row: Glycerin with Bone Stain
3rd row: Polyester
4th row: Larvae [ETOH Fixed]
5th row: Unknown

Value | Count | Frequency (%)
larvae | 25640 | 14.6%
polyester | 20066 | 11.4%
photograph | 14070 | 8.0%
unknown | 11506 | 6.6%
film | 9617 | 5.5%
specimen | 8056 | 4.6%
osteological | 7025 | 4.0%
glycerin | 7019 | 4.0%
with | 7017 | 4.0%
stain | 7012 | 4.0%
Other values (60) | 58274 | 33.2%

Most occurring characters

ValueCountFrequency (%)
e 130540
 
10.1%
a 117241
 
9.1%
o 94955
 
7.4%
r 91527
 
7.1%
t 83181
 
6.5%
n 69625
 
5.4%
l 66760
 
5.2%
i 66635
 
5.2%
(space) 66274
 
5.1%
h 41878
 
3.3%
Other values (46) 458298
35.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1023240
79.5%
Uppercase Letter 182950
 
14.2%
Space Separator 66274
 
5.1%
Other Punctuation 6530
 
0.5%
Open Punctuation 3927
 
0.3%
Close Punctuation 3927
 
0.3%
Dash Punctuation 60
 
< 0.1%
Decimal Number 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 130540
12.8%
a 117241
11.5%
o 94955
9.3%
r 91527
8.9%
t 83181
 
8.1%
n 69625
 
6.8%
l 66760
 
6.5%
i 66635
 
6.5%
h 41878
 
4.1%
y 35275
 
3.4%
Other values (13) 225623
22.0%
Uppercase Letter
ValueCountFrequency (%)
P 34352
18.8%
L 26509
14.5%
S 15784
8.6%
F 15061
8.2%
O 12528
 
6.8%
U 11506
 
6.3%
D 9088
 
5.0%
E 7134
 
3.9%
G 7019
 
3.8%
A 6981
 
3.8%
Other values (11) 36988
20.2%
Other Punctuation
ValueCountFrequency (%)
; 6172
94.5%
. 186
 
2.8%
& 89
 
1.4%
, 78
 
1.2%
% 5
 
0.1%
Decimal Number
ValueCountFrequency (%)
3 4
66.7%
9 1
 
16.7%
5 1
 
16.7%
Space Separator
ValueCountFrequency (%)
(space) 66274
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 3927
100.0%
Close Punctuation
ValueCountFrequency (%)
] 3927
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 60
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1206190
93.7%
Common 80724
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 130540
 
10.8%
a 117241
 
9.7%
o 94955
 
7.9%
r 91527
 
7.6%
t 83181
 
6.9%
n 69625
 
5.8%
l 66760
 
5.5%
i 66635
 
5.5%
h 41878
 
3.5%
y 35275
 
2.9%
Other values (34) 408573
33.9%
Common
ValueCountFrequency (%)
(space) 66274
82.1%
; 6172
 
7.6%
[ 3927
 
4.9%
] 3927
 
4.9%
. 186
 
0.2%
& 89
 
0.1%
, 78
 
0.1%
- 60
 
0.1%
% 5
 
< 0.1%
3 4
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1286914
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 130540
 
10.1%
a 117241
 
9.1%
o 94955
 
7.4%
r 91527
 
7.1%
t 83181
 
6.5%
n 69625
 
5.4%
l 66760
 
5.2%
i 66635
 
5.2%
(space) 66274
 
5.1%
h 41878
 
3.3%
Other values (46) 458298
35.6%

associatedSequences
Text

Missing 

Distinct: 447
Distinct (%): 99.3%
Missing: 454762
Missing (%): 99.9%
Memory size: 3.5 MiB

Length

Max length: 249
Median length: 49
Mean length: 59.88888889
Min length: 49

Characters and Unicode

Total characters: 26950
Distinct characters: 51
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 444
Unique (%): 98.7%

Sample

1st row: https://www.ncbi.nlm.nih.gov/gquery?term=FJ609901
2nd row: https://www.ncbi.nlm.nih.gov/gquery?term=HQ600890
3rd row: https://www.ncbi.nlm.nih.gov/gquery?term=HM748411
4th row: https://www.ncbi.nlm.nih.gov/gquery?term=HQ600884
5th row: https://www.ncbi.nlm.nih.gov/gquery?term=MN621852

Value | Count | Frequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=mn549761 | 2 | 0.4%
https://www.ncbi.nlm.nih.gov/gquery?term=hq543050 | 2 | 0.4%
https://www.ncbi.nlm.nih.gov/gquery?term=hq543049 | 2 | 0.4%
https://www.ncbi.nlm.nih.gov/gquery?term=gq367323 | 1 | 0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=hq543043 | 1 | 0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=hq600884 | 1 | 0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=mn621852 | 1 | 0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=hq325698;https://www.ncbi.nlm.nih.gov/gquery?term=hq325631 | 1 | 0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=hm748370 | 1 | 0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=ef536294;https://www.ncbi.nlm.nih.gov/gquery?term=ef536256;https://www.ncbi.nlm.nih.gov/gquery?term=ef539241;https://www.ncbi.nlm.nih.gov/gquery?term=ef533917;https://www.ncbi.nlm.nih.gov/gquery?term=ef530094 | 1 | 0.2%
Other values (437) | 437 | 97.1%
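
Some values in this variable pack several URLs into one field, separated by semicolons, each carrying an accession in its `term` query parameter. A minimal sketch of pulling those accessions out with the standard library follows. The helper name `accessions` is an assumption, and the combined `field` string is built here from two of the sample rows to mimic a multi-URL value.

```python
from urllib.parse import urlparse, parse_qs

def accessions(field):
    """Extract the 'term' query value from each semicolon-separated URL."""
    found = []
    for url in field.split(";"):
        url = url.strip()
        if not url:
            continue  # ignore empty fragments, e.g. from a trailing semicolon
        query = parse_qs(urlparse(url).query)
        found.extend(query.get("term", []))
    return found

# Two sample-row URLs joined with ';' to imitate a multi-valued field.
field = (
    "https://www.ncbi.nlm.nih.gov/gquery?term=FJ609901;"
    "https://www.ncbi.nlm.nih.gov/gquery?term=HQ600890"
)
ids = accessions(field)
print(ids)
```

Splitting on the semicolon before parsing keeps each URL's query string intact, so the result does not depend on how the parser treats `;` inside a query.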

Most occurring characters

ValueCountFrequency (%)
. 2192
 
8.1%
t 1644
 
6.1%
/ 1644
 
6.1%
w 1644
 
6.1%
n 1644
 
6.1%
h 1096
 
4.1%
r 1096
 
4.1%
e 1096
 
4.1%
i 1096
 
4.1%
m 1096
 
4.1%
Other values (41) 12702
47.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16988
63.0%
Other Punctuation 5030
 
18.7%
Decimal Number 3288
 
12.2%
Uppercase Letter 1096
 
4.1%
Math Symbol 548
 
2.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1644
 
9.7%
w 1644
 
9.7%
n 1644
 
9.7%
h 1096
 
6.5%
r 1096
 
6.5%
e 1096
 
6.5%
i 1096
 
6.5%
m 1096
 
6.5%
g 1096
 
6.5%
v 548
 
3.2%
Other values (9) 4932
29.0%
Uppercase Letter
ValueCountFrequency (%)
Q 219
20.0%
H 165
15.1%
F 141
12.9%
M 122
11.1%
G 99
9.0%
J 85
 
7.8%
A 83
 
7.6%
N 75
 
6.8%
Y 30
 
2.7%
E 29
 
2.6%
Other values (6) 48
 
4.4%
Decimal Number
ValueCountFrequency (%)
6 439
13.4%
0 417
12.7%
3 406
12.3%
4 396
12.0%
7 382
11.6%
9 352
10.7%
5 289
8.8%
8 261
7.9%
2 201
6.1%
1 145
 
4.4%
Other Punctuation
ValueCountFrequency (%)
. 2192
43.6%
/ 1644
32.7%
? 548
 
10.9%
: 548
 
10.9%
; 98
 
1.9%
Math Symbol
ValueCountFrequency (%)
= 548
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18084
67.1%
Common 8866
32.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 1644
 
9.1%
w 1644
 
9.1%
n 1644
 
9.1%
h 1096
 
6.1%
r 1096
 
6.1%
e 1096
 
6.1%
i 1096
 
6.1%
m 1096
 
6.1%
g 1096
 
6.1%
v 548
 
3.0%
Other values (25) 6028
33.3%
Common
ValueCountFrequency (%)
. 2192
24.7%
/ 1644
18.5%
= 548
 
6.2%
? 548
 
6.2%
: 548
 
6.2%
6 439
 
5.0%
0 417
 
4.7%
3 406
 
4.6%
4 396
 
4.5%
7 382
 
4.3%
Other values (6) 1346
15.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26950
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 2192
 
8.1%
t 1644
 
6.1%
/ 1644
 
6.1%
w 1644
 
6.1%
n 1644
 
6.1%
h 1096
 
4.1%
r 1096
 
4.1%
e 1096
 
4.1%
i 1096
 
4.1%
m 1096
 
4.1%
Other values (41) 12702
47.1%

occurrenceRemarks
Text

Missing 

Distinct: 80318
Distinct (%): 48.8%
Missing: 290485
Missing (%): 63.8%
Memory size: 3.5 MiB

Length

Max length: 220700
Median length: 28061
Mean length: 75.7703473
Min length: 1

Characters and Unicode

Total characters: 12481422
Distinct characters: 122
Distinct categories: 15
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 70650
Unique (%): 42.9%

Sample

1st row: Note in ledger: " pair of otoliths"; Otoliths are stored in the Osteo Collection.; Stored in Osteo Collection.; The ototliths are stored in Mugil Box 1 of 1, which contains catalog numbers: 110428, 110429, 110430, 110431, 110432, 110433, 110434, 110435, 110436, 110438, 110439, 110440, and 110441.
2nd row: Cat. no. 105
3rd row: Host-bohadschia argus. rec from: truett, d. f.
4th row: Specimen measurements as written on the slide mount: SL (mm)= 205; TL (mm)= 10" (254); This material is part of the John and Helen Randall Slide Collection. The slides were digitized October 2017. The Randall donation includes all intellectual property rights.; Black paint/goop on the film. Not obscuring specimen.
5th row: Specimen measurements as written on the slide mount: SL (mm)= 57; TL (mm)= 2.8" (71); This material is part of the John and Helen Randall Slide Collection. The slides were digitized October 2017. The Randall donation includes all intellectual property rights.

Value | Count | Frequency (%)
the | 73553 | 3.9%
of | 50804 | 2.7%
in | 34960 | 1.8%
and | 29482 | 1.6%
mm | 26673 | 1.4%
collection | 24234 | 1.3%
specimen | 23152 | 1.2%
as | 22917 | 1.2%
is | 22746 | 1.2%
1 | 22640 | 1.2%
Other values (81634) | 1569990 | 82.6%

Most occurring characters

ValueCountFrequency (%)
(space) 1609957
 
12.9%
e 933339
 
7.5%
a 635649
 
5.1%
i 628891
 
5.0%
t 615692
 
4.9%
n 591012
 
4.7%
o 588776
 
4.7%
s 465772
 
3.7%
l 461424
 
3.7%
r 452874
 
3.6%
Other values (112) 5498036
44.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7643238
61.2%
Space Separator 1609957
 
12.9%
Decimal Number 1182431
 
9.5%
Uppercase Letter 932071
 
7.5%
Other Punctuation 506745
 
4.1%
Control 419081
 
3.4%
Dash Punctuation 70233
 
0.6%
Open Punctuation 32326
 
0.3%
Close Punctuation 32314
 
0.3%
Math Symbol 27221
 
0.2%
Other values (5) 25805
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 933339
12.2%
a 635649
 
8.3%
i 628891
 
8.2%
t 615692
 
8.1%
n 591012
 
7.7%
o 588776
 
7.7%
s 465772
 
6.1%
l 461424
 
6.0%
r 452874
 
5.9%
d 337046
 
4.4%
Other values (31) 1932763
25.3%
Uppercase Letter
ValueCountFrequency (%)
S 114395
12.3%
T 93129
 
10.0%
N 71458
 
7.7%
C 67099
 
7.2%
R 62596
 
6.7%
O 56902
 
6.1%
E 53897
 
5.8%
L 53620
 
5.8%
A 49927
 
5.4%
M 48274
 
5.2%
Other values (22) 260774
28.0%
Other Punctuation
ValueCountFrequency (%)
. 262249
51.8%
, 80367
 
15.9%
: 61012
 
12.0%
; 45848
 
9.0%
" 20083
 
4.0%
/ 19295
 
3.8%
' 6995
 
1.4%
# 6654
 
1.3%
& 2738
 
0.5%
? 808
 
0.2%
Other values (6) 696
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 204458
17.3%
2 172093
14.6%
0 154696
13.1%
3 116867
9.9%
4 100430
8.5%
9 99868
8.4%
5 94414
8.0%
8 81370
 
6.9%
7 80164
 
6.8%
6 78071
 
6.6%
Math Symbol
ValueCountFrequency (%)
= 25537
93.8%
+ 1644
 
6.0%
~ 31
 
0.1%
< 4
 
< 0.1%
> 4
 
< 0.1%
| 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 32094
99.3%
[ 214
 
0.7%
{ 18
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 32084
99.3%
] 214
 
0.7%
} 16
 
< 0.1%
Control
ValueCountFrequency (%)
417203
99.6%
1878
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
- 69691
99.2%
542
 
0.8%
Final Punctuation
ValueCountFrequency (%)
35
67.3%
17
32.7%
Space Separator
ValueCountFrequency (%)
(space) 1609957
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 25725
100.0%
Initial Punctuation
ValueCountFrequency (%)
23
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 3
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8575309
68.7%
Common 3906113
31.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 933339
 
10.9%
a 635649
 
7.4%
i 628891
 
7.3%
t 615692
 
7.2%
n 591012
 
6.9%
o 588776
 
6.9%
s 465772
 
5.4%
l 461424
 
5.4%
r 452874
 
5.3%
d 337046
 
3.9%
Other values (63) 2864834
33.4%
Common
ValueCountFrequency (%)
(space) 1609957
41.2%
417203
 
10.7%
. 262249
 
6.7%
1 204458
 
5.2%
2 172093
 
4.4%
0 154696
 
4.0%
3 116867
 
3.0%
4 100430
 
2.6%
9 99868
 
2.6%
5 94414
 
2.4%
Other values (39) 673878
17.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12479964
> 99.9%
None 841
 
< 0.1%
Punctuation 617
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1609957
 
12.9%
e 933339
 
7.5%
a 635649
 
5.1%
i 628891
 
5.0%
t 615692
 
4.9%
n 591012
 
4.7%
o 588776
 
4.7%
s 465772
 
3.7%
l 461424
 
3.7%
r 452874
 
3.6%
Other values (86) 5496578
44.0%
Punctuation
ValueCountFrequency (%)
542
87.8%
35
 
5.7%
23
 
3.7%
17
 
2.8%
None
ValueCountFrequency (%)
ü 201
23.9%
ã 160
19.0%
è 131
15.6%
å 88
10.5%
é 72
 
8.6%
á 44
 
5.2%
ö 37
 
4.4%
ó 28
 
3.3%
í 21
 
2.5%
ê 21
 
2.5%
Other values (12) 38
 
4.5%

verbatimLabel
Text

Missing 

Distinct: 3
Distinct (%): 100.0%
Missing: 455209
Missing (%): > 99.9%
Memory size: 3.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 6.666666667
Min length: 6

Characters and Unicode

Total characters: 20
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: -9.3883
2nd row: 10.6925
3rd row: 7.0083

Value | Count | Frequency (%)
9.3883 | 1 | 33.3%
10.6925 | 1 | 33.3%
7.0083 | 1 | 33.3%

Most occurring characters

ValueCountFrequency (%)
. 3
15.0%
3 3
15.0%
8 3
15.0%
0 3
15.0%
9 2
10.0%
- 1
 
5.0%
1 1
 
5.0%
6 1
 
5.0%
2 1
 
5.0%
5 1
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16
80.0%
Other Punctuation 3
 
15.0%
Dash Punctuation 1
 
5.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 3
18.8%
8 3
18.8%
0 3
18.8%
9 2
12.5%
1 1
 
6.2%
6 1
 
6.2%
2 1
 
6.2%
5 1
 
6.2%
7 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 20
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 3
15.0%
3 3
15.0%
8 3
15.0%
0 3
15.0%
9 2
10.0%
- 1
 
5.0%
1 1
 
5.0%
6 1
 
5.0%
2 1
 
5.0%
5 1
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 3
15.0%
3 3
15.0%
8 3
15.0%
0 3
15.0%
9 2
10.0%
- 1
 
5.0%
1 1
 
5.0%
6 1
 
5.0%
2 1
 
5.0%
5 1
 
5.0%

materialSampleID
Text

Missing 

Distinct: 3
Distinct (%): 100.0%
Missing: 455209
Missing (%): > 99.9%
Memory size: 3.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 21
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: 46.2133
2nd row: 122.563
3rd row: 158.199

Value | Count | Frequency (%)
46.2133 | 1 | 33.3%
122.563 | 1 | 33.3%
158.199 | 1 | 33.3%

Most occurring characters

ValueCountFrequency (%)
1 4
19.0%
. 3
14.3%
2 3
14.3%
3 3
14.3%
6 2
9.5%
5 2
9.5%
9 2
9.5%
4 1
 
4.8%
8 1
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18
85.7%
Other Punctuation 3
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 4
22.2%
2 3
16.7%
3 3
16.7%
6 2
11.1%
5 2
11.1%
9 2
11.1%
4 1
 
5.6%
8 1
 
5.6%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 21
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 4
19.0%
. 3
14.3%
2 3
14.3%
3 3
14.3%
6 2
9.5%
5 2
9.5%
9 2
9.5%
4 1
 
4.8%
8 1
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 4
19.0%
. 3
14.3%
2 3
14.3%
3 3
14.3%
6 2
9.5%
5 2
9.5%
9 2
9.5%
4 1
 
4.8%
8 1
 
4.8%

eventID
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing455211
Missing (%)> 99.9%
Memory size3.5 MiB
2025-01-08T17:57:06.909895image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 941.0

Value  Count  Frequency (%)
941.0  1      100.0%

Most occurring characters

ValueCountFrequency (%)
9 1
20.0%
4 1
20.0%
1 1
20.0%
. 1
20.0%
0 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
80.0%
Other Punctuation 1
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 1
25.0%
4 1
25.0%
1 1
25.0%
0 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 1
20.0%
4 1
20.0%
1 1
20.0%
. 1
20.0%
0 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 1
20.0%
4 1
20.0%
1 1
20.0%
. 1
20.0%
0 1
20.0%

fieldNumber
Text

Missing 

Distinct: 25291
Distinct (%): 14.0%
Missing: 274211
Missing (%): 60.2%
Memory size: 3.5 MiB

Length

Max length: 149
Median length: 70
Mean length: 10.07364048
Min length: 1

Characters and Unicode

Total characters: 1823339
Distinct characters: 82
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 10523
Unique (%): 5.8%

Sample

1st row: FJS-455
2nd row: M10-97B4 (40-60m)
3rd row: SP 78-18
4th row: BBC 1734 A; M-84
5th row: PHISH-2016-05; SIA-06

Value                 Count   Frequency (%)
vgs                   19290   5.7%
jtw                   14298   4.2%
bbc                   6125    1.8%
lwk                   4274    1.3%
lk                    4258    1.3%
sol                   3414    1.0%
rpv                   3291    1.0%
sp                    3134    0.9%
bayley                2740    0.8%
lrp                   2643    0.8%
Other values (22433)  275090  81.3%

Most occurring characters

ValueCountFrequency (%)
- 190028
 
10.4%
157556
 
8.6%
1 125554
 
6.9%
0 109950
 
6.0%
2 103376
 
5.7%
9 89324
 
4.9%
6 82127
 
4.5%
7 76841
 
4.2%
3 73150
 
4.0%
8 68610
 
3.8%
Other values (72) 746823
41.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 855884
46.9%
Uppercase Letter 548142
30.1%
Dash Punctuation 190028
 
10.4%
Space Separator 157556
 
8.6%
Other Punctuation 39951
 
2.2%
Lowercase Letter 29723
 
1.6%
Close Punctuation 995
 
0.1%
Open Punctuation 994
 
0.1%
Math Symbol 62
 
< 0.1%
Final Punctuation 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 54493
 
9.9%
A 45928
 
8.4%
V 35510
 
6.5%
L 34668
 
6.3%
W 32879
 
6.0%
T 31271
 
5.7%
B 30330
 
5.5%
G 27885
 
5.1%
C 27370
 
5.0%
M 26822
 
4.9%
Other values (16) 200986
36.7%
Lowercase Letter
ValueCountFrequency (%)
o 5688
19.1%
t 3332
11.2%
e 3308
11.1%
a 3141
10.6%
r 2575
8.7%
i 1867
 
6.3%
n 1554
 
5.2%
u 1507
 
5.1%
m 1427
 
4.8%
l 1415
 
4.8%
Other values (15) 3909
13.2%
Other Punctuation
ValueCountFrequency (%)
; 32939
82.4%
. 4326
 
10.8%
, 1177
 
2.9%
# 917
 
2.3%
: 267
 
0.7%
/ 177
 
0.4%
& 60
 
0.2%
' 43
 
0.1%
? 31
 
0.1%
" 8
 
< 0.1%
Other values (2) 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 125554
14.7%
0 109950
12.8%
2 103376
12.1%
9 89324
10.4%
6 82127
9.6%
7 76841
9.0%
3 73150
8.5%
8 68610
8.0%
4 65361
7.6%
5 61591
7.2%
Close Punctuation
ValueCountFrequency (%)
) 992
99.7%
] 3
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 991
99.7%
[ 3
 
0.3%
Dash Punctuation
ValueCountFrequency (%)
- 190028
100.0%
Space Separator
ValueCountFrequency (%)
157556
100.0%
Math Symbol
ValueCountFrequency (%)
+ 62
100.0%
Final Punctuation
ValueCountFrequency (%)
3
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1245474
68.3%
Latin 577865
31.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 54493
 
9.4%
A 45928
 
7.9%
V 35510
 
6.1%
L 34668
 
6.0%
W 32879
 
5.7%
T 31271
 
5.4%
B 30330
 
5.2%
G 27885
 
4.8%
C 27370
 
4.7%
M 26822
 
4.6%
Other values (41) 230709
39.9%
Common
ValueCountFrequency (%)
- 190028
15.3%
157556
12.7%
1 125554
10.1%
0 109950
8.8%
2 103376
8.3%
9 89324
7.2%
6 82127
6.6%
7 76841
6.2%
3 73150
 
5.9%
8 68610
 
5.5%
Other values (21) 168958
13.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1823336
> 99.9%
Punctuation 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 190028
 
10.4%
157556
 
8.6%
1 125554
 
6.9%
0 109950
 
6.0%
2 103376
 
5.7%
9 89324
 
4.9%
6 82127
 
4.5%
7 76841
 
4.2%
3 73150
 
4.0%
8 68610
 
3.8%
Other values (71) 746820
41.0%
Punctuation
ValueCountFrequency (%)
3
100.0%

eventDate
Text

Missing 

Distinct: 30514
Distinct (%): 7.7%
Missing: 60241
Missing (%): 13.2%
Memory size: 3.5 MiB

Length

Max length: 21
Median length: 10
Mean length: 10.09004712
Min length: 4

Characters and Unicode

Total characters: 3985276
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 8111
Unique (%): 2.1%

Sample

1st row: 1938-03-25
2nd row: 1956-05-30
3rd row: 1997-05-10
4th row: 1978-05-22
5th row: 1928-02-10

Value                  Count   Frequency (%)
1906                   1477    0.4%
1902                   1141    0.3%
1888                   1112    0.3%
1889                   938     0.2%
1994-05-06             927     0.2%
1994-04-30             702     0.2%
1901                   595     0.2%
1880                   568     0.1%
1970-09-11/1970-09-16  510     0.1%
1893                   440     0.1%
Other values (30504)   386561  97.9%

Most occurring characters

ValueCountFrequency (%)
- 773124
19.4%
1 729306
18.3%
0 645975
16.2%
9 528070
13.3%
2 287210
 
7.2%
8 212491
 
5.3%
6 183136
 
4.6%
7 181671
 
4.6%
5 151243
 
3.8%
3 141325
 
3.5%
Other values (2) 151725
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3194948
80.2%
Dash Punctuation 773124
 
19.4%
Other Punctuation 17204
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 729306
22.8%
0 645975
20.2%
9 528070
16.5%
2 287210
 
9.0%
8 212491
 
6.7%
6 183136
 
5.7%
7 181671
 
5.7%
5 151243
 
4.7%
3 141325
 
4.4%
4 134521
 
4.2%
Dash Punctuation
ValueCountFrequency (%)
- 773124
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 17204
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3985276
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 773124
19.4%
1 729306
18.3%
0 645975
16.2%
9 528070
13.3%
2 287210
 
7.2%
8 212491
 
5.3%
6 183136
 
4.6%
7 181671
 
4.6%
5 151243
 
3.8%
3 141325
 
3.5%
Other values (2) 151725
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3985276
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 773124
19.4%
1 729306
18.3%
0 645975
16.2%
9 528070
13.3%
2 287210
 
7.2%
8 212491
 
5.3%
6 183136
 
4.6%
7 181671
 
4.6%
5 151243
 
3.8%
3 141325
 
3.5%
Other values (2) 151725
 
3.8%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 91500
Missing (%): 20.1%
Memory size: 3.5 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.766785259
Min length: 1

Characters and Unicode

Total characters: 1006313
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 84
2nd row: 151
3rd row: 130
4th row: 142
5th row: 41
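
The five sample values above appear to be the day-of-year of the five eventDate samples listed earlier in this report (1938-03-25, 1956-05-30, 1997-05-10, 1978-05-22, 1928-02-10), assuming the sample rows of both variables refer to the same records. The following sketch checks that relationship with Python's standard `datetime` module; the helper name `day_of_year` is hypothetical, and how the data publisher actually derives startDayOfYear is not stated in the report.

```python
from datetime import date


def day_of_year(iso_date):
    """Return the 1-based ordinal day of the year for a YYYY-MM-DD string."""
    return date.fromisoformat(iso_date).timetuple().tm_yday


# eventDate sample rows from this report (assumed to be the same records
# as the startDayOfYear sample rows).
event_dates = ["1938-03-25", "1956-05-30", "1997-05-10", "1978-05-22", "1928-02-10"]
days = [day_of_year(d) for d in event_dates]
print(days)
```

Note that this only works for full dates; eventDate values given as a bare year (for example `1906`) or as an interval (for example `1970-09-11/1970-09-16`) would need separate handling.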
Value               Count   Frequency (%)
126                 2409    0.7%
251                 1989    0.5%
120                 1928    0.5%
117                 1868    0.5%
146                 1854    0.5%
159                 1853    0.5%
143                 1786    0.5%
161                 1783    0.5%
141                 1777    0.5%
154                 1775    0.5%
Other values (356)  344690  94.8%

Most occurring characters

ValueCountFrequency (%)
1 204382
20.3%
2 182397
18.1%
3 128618
12.8%
5 80887
 
8.0%
4 77962
 
7.7%
6 76302
 
7.6%
0 67269
 
6.7%
7 65487
 
6.5%
9 63356
 
6.3%
8 59653
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1006313
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 204382
20.3%
2 182397
18.1%
3 128618
12.8%
5 80887
 
8.0%
4 77962
 
7.7%
6 76302
 
7.6%
0 67269
 
6.7%
7 65487
 
6.5%
9 63356
 
6.3%
8 59653
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1006313
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 204382
20.3%
2 182397
18.1%
3 128618
12.8%
5 80887
 
8.0%
4 77962
 
7.7%
6 76302
 
7.6%
0 67269
 
6.7%
7 65487
 
6.5%
9 63356
 
6.3%
8 59653
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1006313
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 204382
20.3%
2 182397
18.1%
3 128618
12.8%
5 80887
 
8.0%
4 77962
 
7.7%
6 76302
 
7.6%
0 67269
 
6.7%
7 65487
 
6.5%
9 63356
 
6.3%
8 59653
 
5.9%

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 91500
Missing (%): 20.1%
Memory size: 3.5 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.768014253
Min length: 1

Characters and Unicode

Total characters: 1006760
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 84
2nd row: 151
3rd row: 130
4th row: 142
5th row: 41

Value               Count   Frequency (%)
126                 2497    0.7%
251                 1977    0.5%
120                 1908    0.5%
117                 1873    0.5%
161                 1803    0.5%
116                 1783    0.5%
141                 1778    0.5%
159                 1755    0.5%
146                 1754    0.5%
145                 1748    0.5%
Other values (356)  344836  94.8%

Most occurring characters

ValueCountFrequency (%)
1 204451
20.3%
2 182936
18.2%
3 128905
12.8%
5 81170
 
8.1%
4 77196
 
7.7%
6 75877
 
7.5%
0 66802
 
6.6%
7 65577
 
6.5%
9 63938
 
6.4%
8 59908
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1006760
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 204451
20.3%
2 182936
18.2%
3 128905
12.8%
5 81170
 
8.1%
4 77196
 
7.7%
6 75877
 
7.5%
0 66802
 
6.6%
7 65577
 
6.5%
9 63938
 
6.4%
8 59908
 
6.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1006760
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 204451
20.3%
2 182936
18.2%
3 128905
12.8%
5 81170
 
8.1%
4 77196
 
7.7%
6 75877
 
7.5%
0 66802
 
6.6%
7 65577
 
6.5%
9 63938
 
6.4%
8 59908
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1006760
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 204451
20.3%
2 182936
18.2%
3 128905
12.8%
5 81170
 
8.1%
4 77196
 
7.7%
6 75877
 
7.5%
0 66802
 
6.6%
7 65577
 
6.5%
9 63938
 
6.4%
8 59908
 
6.0%

year
Text

Missing 

Distinct: 191
Distinct (%): < 0.1%
Missing: 60500
Missing (%): 13.3%
Memory size: 3.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1578848
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 7
Unique (%): < 0.1%

Sample

1st row: 1938
2nd row: 1956
3rd row: 1997
4th row: 1978
5th row: 1928

Value               Count   Frequency (%)
1909                17685   4.5%
1908                13638   3.5%
1970                11215   2.8%
1969                10429   2.6%
1964                9054    2.3%
1978                8877    2.2%
1967                8372    2.1%
1979                7846    2.0%
1971                7828    2.0%
1968                7271    1.8%
Other values (181)  292497  74.1%

Most occurring characters

ValueCountFrequency (%)
9 431944
27.4%
1 411052
26.0%
0 149894
 
9.5%
8 131390
 
8.3%
7 103219
 
6.5%
6 100594
 
6.4%
2 83679
 
5.3%
5 59572
 
3.8%
4 57248
 
3.6%
3 50256
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1578848
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 431944
27.4%
1 411052
26.0%
0 149894
 
9.5%
8 131390
 
8.3%
7 103219
 
6.5%
6 100594
 
6.4%
2 83679
 
5.3%
5 59572
 
3.8%
4 57248
 
3.6%
3 50256
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Common 1578848
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 431944
27.4%
1 411052
26.0%
0 149894
 
9.5%
8 131390
 
8.3%
7 103219
 
6.5%
6 100594
 
6.4%
2 83679
 
5.3%
5 59572
 
3.8%
4 57248
 
3.6%
3 50256
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1578848
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 431944
27.4%
1 411052
26.0%
0 149894
 
9.5%
8 131390
 
8.3%
7 103219
 
6.5%
6 100594
 
6.4%
2 83679
 
5.3%
5 59572
 
3.8%
4 57248
 
3.6%
3 50256
 
3.2%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 82757
Missing (%): 18.2%
Memory size: 3.5 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.196200883
Min length: 1

Characters and Unicode

Total characters: 445531
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 5
3rd row: 5
4th row: 5
5th row: 2

Value             Count  Frequency (%)
5                 47294  12.7%
6                 37778  10.1%
9                 37734  10.1%
8                 35582  9.6%
4                 34250  9.2%
7                 33265  8.9%
3                 32600  8.8%
11                31410  8.4%
10                23798  6.4%
2                 22626  6.1%
Other values (2)  36118  9.7%

Most occurring characters

ValueCountFrequency (%)
1 122736
27.5%
5 47294
 
10.6%
2 40494
 
9.1%
6 37778
 
8.5%
9 37734
 
8.5%
8 35582
 
8.0%
4 34250
 
7.7%
7 33265
 
7.5%
3 32600
 
7.3%
0 23798
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 445531
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 122736
27.5%
5 47294
 
10.6%
2 40494
 
9.1%
6 37778
 
8.5%
9 37734
 
8.5%
8 35582
 
8.0%
4 34250
 
7.7%
7 33265
 
7.5%
3 32600
 
7.3%
0 23798
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
Common 445531
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 122736
27.5%
5 47294
 
10.6%
2 40494
 
9.1%
6 37778
 
8.5%
9 37734
 
8.5%
8 35582
 
8.0%
4 34250
 
7.7%
7 33265
 
7.5%
3 32600
 
7.3%
0 23798
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 445531
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 122736
27.5%
5 47294
 
10.6%
2 40494
 
9.1%
6 37778
 
8.5%
9 37734
 
8.5%
8 35582
 
8.0%
4 34250
 
7.7%
7 33265
 
7.5%
3 32600
 
7.3%
0 23798
 
5.3%

day
Text

Missing 

Distinct: 32
Distinct (%): < 0.1%
Missing: 108703
Missing (%): 23.9%
Memory size: 3.5 MiB

Length

Max length: 96
Median length: 2
Mean length: 1.693840564
Min length: 1

Characters and Unicode

Total characters: 586931
Distinct characters: 40
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 25
2nd row: 30
3rd row: 10
4th row: 22
5th row: 10

Value              Count   Frequency (%)
8                  13276   3.8%
15                 12432   3.6%
5                  12265   3.5%
23                 12122   3.5%
7                  12104   3.5%
6                  12098   3.5%
3                  11850   3.4%
14                 11823   3.4%
16                 11810   3.4%
11                 11460   3.3%
Other values (35)  225282  65.0%

Most occurring characters

ValueCountFrequency (%)
1 153101
26.1%
2 144165
24.6%
3 50450
 
8.6%
5 35497
 
6.0%
6 35148
 
6.0%
8 35145
 
6.0%
4 34647
 
5.9%
7 34166
 
5.8%
9 32496
 
5.5%
0 32024
 
5.5%
Other values (30) 92
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 586839
> 99.9%
Lowercase Letter 64
 
< 0.1%
Space Separator 13
 
< 0.1%
Uppercase Letter 9
 
< 0.1%
Other Punctuation 4
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11
17.2%
o 7
10.9%
r 7
10.9%
a 5
7.8%
c 4
 
6.2%
i 4
 
6.2%
t 4
 
6.2%
n 4
 
6.2%
s 3
 
4.7%
d 3
 
4.7%
Other values (9) 12
18.8%
Decimal Number
ValueCountFrequency (%)
1 153101
26.1%
2 144165
24.6%
3 50450
 
8.6%
5 35497
 
6.0%
6 35148
 
6.0%
8 35145
 
6.0%
4 34647
 
5.9%
7 34166
 
5.8%
9 32496
 
5.5%
0 32024
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
G 3
33.3%
P 2
22.2%
B 1
 
11.1%
C 1
 
11.1%
W 1
 
11.1%
E 1
 
11.1%
Other Punctuation
ValueCountFrequency (%)
. 3
75.0%
, 1
 
25.0%
Space Separator
ValueCountFrequency (%)
13
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 586858
> 99.9%
Latin 73
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11
15.1%
o 7
 
9.6%
r 7
 
9.6%
a 5
 
6.8%
c 4
 
5.5%
i 4
 
5.5%
t 4
 
5.5%
n 4
 
5.5%
s 3
 
4.1%
d 3
 
4.1%
Other values (15) 21
28.8%
Common
ValueCountFrequency (%)
1 153101
26.1%
2 144165
24.6%
3 50450
 
8.6%
5 35497
 
6.0%
6 35148
 
6.0%
8 35145
 
6.0%
4 34647
 
5.9%
7 34166
 
5.8%
9 32496
 
5.5%
0 32024
 
5.5%
Other values (5) 19
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 586931
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 153101
26.1%
2 144165
24.6%
3 50450
 
8.6%
5 35497
 
6.0%
6 35148
 
6.0%
8 35145
 
6.0%
4 34647
 
5.9%
7 34166
 
5.8%
9 32496
 
5.5%
0 32024
 
5.5%
Other values (30) 92
 
< 0.1%

verbatimEventDate
Text

Missing 

Distinct: 34047
Distinct (%): 9.4%
Missing: 92472
Missing (%): 20.3%
Memory size: 3.5 MiB

Length

Max length: 102
Median length: 98
Mean length: 26.64553123
Min length: 2

Characters and Unicode

Total characters: 9665400
Distinct characters: 71
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 10960
Unique (%): 3.0%

Sample

1st row: 0000 00 00 - 0000 00 00
2nd row: 1938 Mar 25 - 0000 00 00
3rd row: 0000 00 00 - 0000 00 00
4th row: 1956 May 30 - 0000 00 00
5th row: 0000 00 00 - 0000 00 00
Value                Count   Frequency (%)
00                   796795  29.5%
(blank)              420492  15.6%
0000                 373796  13.9%
may                  36088   1.3%
jun                  32030   1.2%
sep                  31699   1.2%
aug                  30346   1.1%
apr                  29080   1.1%
jul                  27007   1.0%
mar                  26144   1.0%
Other values (2486)  893609  33.1%

Most occurring characters

ValueCountFrequency (%)
0 3533769
36.6%
2334346
24.2%
1 644236
 
6.7%
9 432298
 
4.5%
- 426014
 
4.4%
2 223234
 
2.3%
: 175106
 
1.8%
8 153818
 
1.6%
3 151315
 
1.6%
5 144564
 
1.5%
Other values (61) 1446700
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5678799
58.8%
Space Separator 2334346
24.2%
Lowercase Letter 643687
 
6.7%
Dash Punctuation 426016
 
4.4%
Uppercase Letter 314140
 
3.3%
Other Punctuation 267700
 
2.8%
Open Punctuation 326
 
< 0.1%
Close Punctuation 326
 
< 0.1%
Math Symbol 60
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 91605
14.2%
a 81376
12.6%
e 70054
10.9%
p 62172
9.7%
r 58263
9.1%
n 50110
7.8%
y 36475
 
5.7%
c 34893
 
5.4%
g 30933
 
4.8%
l 28217
 
4.4%
Other values (14) 99589
15.5%
Uppercase Letter
ValueCountFrequency (%)
J 75515
24.0%
M 63514
20.2%
A 61266
19.5%
S 31961
10.2%
N 25248
 
8.0%
F 19093
 
6.1%
O 17673
 
5.6%
D 16283
 
5.2%
H 1232
 
0.4%
T 729
 
0.2%
Other values (13) 1626
 
0.5%
Decimal Number
ValueCountFrequency (%)
0 3533769
62.2%
1 644236
 
11.3%
9 432298
 
7.6%
2 223234
 
3.9%
8 153818
 
2.7%
3 151315
 
2.7%
5 144564
 
2.5%
6 135744
 
2.4%
7 135695
 
2.4%
4 124126
 
2.2%
Other Punctuation
ValueCountFrequency (%)
: 175106
65.4%
; 89844
33.6%
, 1348
 
0.5%
. 1124
 
0.4%
/ 273
 
0.1%
? 5
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 426014
> 99.9%
2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 325
99.7%
[ 1
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 325
99.7%
] 1
 
0.3%
Space Separator
ValueCountFrequency (%)
2334346
100.0%
Math Symbol
ValueCountFrequency (%)
+ 60
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8707573
90.1%
Latin 957827
 
9.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 91605
 
9.6%
a 81376
 
8.5%
J 75515
 
7.9%
e 70054
 
7.3%
M 63514
 
6.6%
p 62172
 
6.5%
A 61266
 
6.4%
r 58263
 
6.1%
n 50110
 
5.2%
y 36475
 
3.8%
Other values (37) 307477
32.1%
Common
ValueCountFrequency (%)
0 3533769
40.6%
2334346
26.8%
1 644236
 
7.4%
9 432298
 
5.0%
- 426014
 
4.9%
2 223234
 
2.6%
: 175106
 
2.0%
8 153818
 
1.8%
3 151315
 
1.7%
5 144564
 
1.7%
Other values (14) 488873
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9665398
> 99.9%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3533769
36.6%
2334346
24.2%
1 644236
 
6.7%
9 432298
 
4.5%
- 426014
 
4.4%
2 223234
 
2.3%
: 175106
 
1.8%
8 153818
 
1.6%
3 151315
 
1.6%
5 144564
 
1.5%
Other values (60) 1446698
15.0%
Punctuation
ValueCountFrequency (%)
2
100.0%

locationID
Text

Missing 

Distinct: 16404
Distinct (%): 15.9%
Missing: 352012
Missing (%): 77.3%
Memory size: 3.5 MiB

Length

Max length: 68
Median length: 40
Mean length: 5.144757752
Min length: 1

Characters and Unicode

Total characters: 530939
Distinct characters: 75
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 6089
Unique (%): 5.9%

Sample

1st row: M10-97B4 (4
2nd row: 4-31N
3rd row: 5627
4th row: 308
5th row: B12 TR4

Value                 Count   Frequency (%)
d                     13062   9.8%
tc                    3543    2.7%
haul                  1244    0.9%
trans                 1038    0.8%
1                     918     0.7%
2                     894     0.7%
tt                    799     0.6%
4                     661     0.5%
3                     655     0.5%
5                     629     0.5%
Other values (13796)  109250  82.3%

Most occurring characters

ValueCountFrequency (%)
1 61871
 
11.7%
2 49289
 
9.3%
- 37677
 
7.1%
3 36373
 
6.9%
4 36203
 
6.8%
5 34152
 
6.4%
0 31687
 
6.0%
29493
 
5.6%
7 29220
 
5.5%
6 27484
 
5.2%
Other values (65) 157490
29.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 348695
65.7%
Uppercase Letter 95277
 
17.9%
Dash Punctuation 37677
 
7.1%
Space Separator 29493
 
5.6%
Other Punctuation 11083
 
2.1%
Lowercase Letter 7547
 
1.4%
Open Punctuation 805
 
0.2%
Close Punctuation 350
 
0.1%
Math Symbol 12
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
D 15184
15.9%
A 11061
11.6%
T 9956
10.4%
C 7283
 
7.6%
N 6625
 
7.0%
S 5565
 
5.8%
M 4811
 
5.0%
B 4690
 
4.9%
E 4552
 
4.8%
P 4397
 
4.6%
Other values (16) 21153
22.2%
Lowercase Letter
ValueCountFrequency (%)
u 1395
18.5%
a 1021
13.5%
r 854
11.3%
i 678
9.0%
m 582
7.7%
q 576
7.6%
o 499
 
6.6%
n 332
 
4.4%
t 315
 
4.2%
e 236
 
3.1%
Other values (13) 1059
14.0%
Decimal Number
ValueCountFrequency (%)
1 61871
17.7%
2 49289
14.1%
3 36373
10.4%
4 36203
10.4%
5 34152
9.8%
0 31687
9.1%
7 29220
8.4%
6 27484
7.9%
8 21650
 
6.2%
9 20766
 
6.0%
Other Punctuation
ValueCountFrequency (%)
. 7463
67.3%
; 1088
 
9.8%
/ 1054
 
9.5%
, 808
 
7.3%
& 269
 
2.4%
? 185
 
1.7%
# 112
 
1.0%
: 101
 
0.9%
' 2
 
< 0.1%
" 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 9
75.0%
= 3
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 37677
100.0%
Space Separator
ValueCountFrequency (%)
29493
100.0%
Open Punctuation
ValueCountFrequency (%)
( 805
100.0%
Close Punctuation
ValueCountFrequency (%)
) 350
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 428115
80.6%
Latin 102824
 
19.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
D 15184
14.8%
A 11061
 
10.8%
T 9956
 
9.7%
C 7283
 
7.1%
N 6625
 
6.4%
S 5565
 
5.4%
M 4811
 
4.7%
B 4690
 
4.6%
E 4552
 
4.4%
P 4397
 
4.3%
Other values (39) 28700
27.9%
Common
ValueCountFrequency (%)
1 61871
14.5%
2 49289
11.5%
- 37677
8.8%
3 36373
8.5%
4 36203
8.5%
5 34152
8.0%
0 31687
7.4%
29493
6.9%
7 29220
6.8%
6 27484
6.4%
Other values (16) 54666
12.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 530939
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 61871
 
11.7%
2 49289
 
9.3%
- 37677
 
7.1%
3 36373
 
6.9%
4 36203
 
6.8%
5 34152
 
6.4%
0 31687
 
6.0%
29493
 
5.6%
7 29220
 
5.5%
6 27484
 
5.2%
Other values (65) 157490
29.7%

higherGeography
Text

Missing 

Distinct: 13756
Distinct (%): 3.2%
Missing: 20492
Missing (%): 4.5%
Memory size: 3.5 MiB

Length

Max length: 177
Median length: 131
Mean length: 59.33840633
Min length: 4

Characters and Unicode

Total characters: 25795592
Distinct characters: 123
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 5

Unique

Unique: 3844
Unique (%): 0.9%

Sample

1st row: North Pacific Ocean, United States, Hawaii, Hawaiian Islands
2nd row: North Atlantic Ocean, Gulf of Mexico, United States, Florida, Hillsborough County
3rd row: North Pacific Ocean, Japan, Tokyo Prefecture, Japanese Archipelago, Honshu
4th row: North America, United States, West Virginia, Randolph County
5th row: Atlantic, Caribbean Sea, Barbados, Lesser Antilles, Barbados

Value                Count    Frequency (%)
ocean                297628   8.7%
north                281556   8.2%
pacific              178084   5.2%
united               125556   3.7%
states               125313   3.7%
islands              124596   3.6%
atlantic             113798   3.3%
south                106956   3.1%
america              96813    2.8%
county               72814    2.1%
Other values (6594)  1896684  55.5%

Most occurring characters

ValueCountFrequency (%)
2985078
 
11.6%
a 2677647
 
10.4%
i 1842529
 
7.1%
n 1669001
 
6.5%
e 1623420
 
6.3%
t 1398934
 
5.4%
, 1342963
 
5.2%
o 1188502
 
4.6%
c 1128651
 
4.4%
r 1085797
 
4.2%
Other values (113) 8853070
34.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18063036
70.0%
Uppercase Letter 3373888
 
13.1%
Space Separator 2985078
 
11.6%
Other Punctuation 1353273
 
5.2%
Open Punctuation 6855
 
< 0.1%
Close Punctuation 6855
 
< 0.1%
Dash Punctuation 6542
 
< 0.1%
Format 30
 
< 0.1%
Decimal Number 28
 
< 0.1%
Modifier Letter 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2677647
14.8%
i 1842529
10.2%
n 1669001
9.2%
e 1623420
9.0%
t 1398934
 
7.7%
o 1188502
 
6.6%
c 1128651
 
6.2%
r 1085797
 
6.0%
l 929600
 
5.1%
s 877851
 
4.9%
Other values (57) 3641104
20.2%
Uppercase Letter
ValueCountFrequency (%)
S 417855
12.4%
P 393221
11.7%
A 382224
11.3%
N 334378
9.9%
O 326042
9.7%
C 239987
7.1%
I 236820
 
7.0%
M 157026
 
4.7%
B 155806
 
4.6%
U 132231
 
3.9%
Other values (26) 598298
17.7%
Other Punctuation
ValueCountFrequency (%)
, 1342963
99.2%
' 5377
 
0.4%
. 3635
 
0.3%
; 1110
 
0.1%
/ 108
 
< 0.1%
: 80
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 13
46.4%
0 7
25.0%
3 4
 
14.3%
1 3
 
10.7%
4 1
 
3.6%
Dash Punctuation
ValueCountFrequency (%)
- 6358
97.2%
184
 
2.8%
Open Punctuation
ValueCountFrequency (%)
( 5690
83.0%
[ 1165
 
17.0%
Close Punctuation
ValueCountFrequency (%)
) 5690
83.0%
] 1165
 
17.0%
Space Separator
ValueCountFrequency (%)
2985078
100.0%
Format
ValueCountFrequency (%)
30
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21436924
83.1%
Common 4358668
 
16.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2677647
 
12.5%
i 1842529
 
8.6%
n 1669001
 
7.8%
e 1623420
 
7.6%
t 1398934
 
6.5%
o 1188502
 
5.5%
c 1128651
 
5.3%
r 1085797
 
5.1%
l 929600
 
4.3%
s 877851
 
4.1%
Other values (93) 7014992
32.7%
Common
ValueCountFrequency (%)
2985078
68.5%
, 1342963
30.8%
- 6358
 
0.1%
( 5690
 
0.1%
) 5690
 
0.1%
' 5377
 
0.1%
. 3635
 
0.1%
] 1165
 
< 0.1%
[ 1165
 
< 0.1%
; 1110
 
< 0.1%
Other values (10) 437
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25787588
> 99.9%
None 7726
 
< 0.1%
Punctuation 214
 
< 0.1%
Latin Ext Additional 57
 
< 0.1%
Modifier Letters 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2985078
 
11.6%
a 2677647
 
10.4%
i 1842529
 
7.1%
n 1669001
 
6.5%
e 1623420
 
6.3%
t 1398934
 
5.4%
, 1342963
 
5.2%
o 1188502
 
4.6%
c 1128651
 
4.4%
r 1085797
 
4.2%
Other values (59) 8845066
34.3%
None
ValueCountFrequency (%)
ó 2695
34.9%
á 2628
34.0%
í 1137
14.7%
é 385
 
5.0%
ñ 174
 
2.3%
ã 109
 
1.4%
ú 107
 
1.4%
Ō 78
 
1.0%
Î 59
 
0.8%
Ø 51
 
0.7%
Other values (31) 303
 
3.9%
Punctuation
ValueCountFrequency (%)
184
86.0%
30
 
14.0%
Latin Ext Additional
ValueCountFrequency (%)
15
26.3%
11
19.3%
6
 
10.5%
ế 6
 
10.5%
5
 
8.8%
4
 
7.0%
4
 
7.0%
4
 
7.0%
1
 
1.8%
1
 
1.8%
Modifier Letters
ValueCountFrequency (%)
ʻ 7
100.0%

continent
Text

Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 162647
Missing (%): 35.7%
Memory size: 3.5 MiB

Length

Max length: 13
Median length: 10
Mean length: 8.959157794
Min length: 4

Characters and Unicode

Total characters: 2621136
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NORTH_AMERICA
2nd row: NORTH_AMERICA
3rd row: ASIA
4th row: AFRICA
5th row: OCEANIA

Value          Count   Frequency (%)
north_america  101424  34.7%
asia           73673   25.2%
oceania        62827   21.5%
south_america  34099   11.7%
africa         17795   6.1%
europe         2346    0.8%
antarctica     401     0.1%

Most occurring characters

ValueCountFrequency (%)
A 580839
22.2%
I 290219
11.1%
R 257489
9.8%
C 216947
 
8.3%
E 203042
 
7.7%
O 200696
 
7.7%
N 164652
 
6.3%
T 136325
 
5.2%
H 135523
 
5.2%
_ 135523
 
5.2%
Other values (5) 299881
11.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2485613
94.8%
Connector Punctuation 135523
 
5.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 580839
23.4%
I 290219
11.7%
R 257489
10.4%
C 216947
 
8.7%
E 203042
 
8.2%
O 200696
 
8.1%
N 164652
 
6.6%
T 136325
 
5.5%
H 135523
 
5.5%
M 135523
 
5.5%
Other values (4) 164358
 
6.6%
Connector Punctuation
ValueCountFrequency (%)
_ 135523
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2485613
94.8%
Common 135523
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 580839
23.4%
I 290219
11.7%
R 257489
10.4%
C 216947
 
8.7%
E 203042
 
8.2%
O 200696
 
8.1%
N 164652
 
6.6%
T 136325
 
5.5%
H 135523
 
5.5%
M 135523
 
5.5%
Other values (4) 164358
 
6.6%
Common
ValueCountFrequency (%)
_ 135523
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2621136
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 580839
22.2%
I 290219
11.1%
R 257489
9.8%
C 216947
 
8.3%
E 203042
 
7.7%
O 200696
 
7.7%
N 164652
 
6.3%
T 136325
 
5.2%
H 135523
 
5.2%
_ 135523
 
5.2%
Other values (5) 299881
11.4%

waterBody
Text

Missing 

Distinct: 1776
Distinct (%): 0.6%
Missing: 133275
Missing (%): 29.3%
Memory size: 3.5 MiB

Length

Max length: 72
Median length: 71
Mean length: 24.05968559
Min length: 6

Characters and Unicode

Total characters: 7745703
Distinct characters: 68
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique489 ?
Unique (%)0.2%
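
The summary figures above (distinct, missing, unique, and length statistics) can be approximated with the Python standard library. This is a minimal sketch, not the report generator's own code: the function name `text_summary` and the sample list are hypothetical, and it assumes that percentages for distinct and unique values are taken over non-missing rows while the missing percentage is taken over all rows, which is consistent with the numbers shown here.

```python
from collections import Counter
from statistics import mean, median


def text_summary(values):
    """Summarise a text column given as a list of str or None.

    Assumes at least one non-missing value (hypothetical helper).
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)
    lengths = [len(v) for v in present]
    n_total, n_present = len(values), len(present)
    n_unique = sum(1 for c in counts.values() if c == 1)  # seen exactly once
    return {
        "distinct": len(counts),
        "distinct_pct": 100 * len(counts) / n_present,
        "missing": n_total - n_present,
        "missing_pct": 100 * (n_total - n_present) / n_total,
        "unique": n_unique,
        "unique_pct": 100 * n_unique / n_present,
        "max_length": max(lengths),
        "median_length": median(lengths),
        "mean_length": mean(lengths),
        "min_length": min(lengths),
        "total_characters": sum(lengths),
    }


# Illustrative input only: four sample values plus one missing entry.
sample = [
    "North Pacific Ocean",
    "North Atlantic Ocean, Gulf of Mexico",
    "North Pacific Ocean",
    "Atlantic, Caribbean Sea",
    None,
]
summary = text_summary(sample)
```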

Sample

1st row North Pacific Ocean
2nd row North Atlantic Ocean, Gulf of Mexico
3rd row North Pacific Ocean
4th row Atlantic, Caribbean Sea
5th row North Pacific Ocean
ValueCountFrequency (%)
ocean 296315
24.5%
north 200693
16.6%
pacific 178071
14.7%
atlantic 113701
 
9.4%
south 68065
 
5.6%
sea 63584
 
5.3%
of 34822
 
2.9%
gulf 34750
 
2.9%
bay 30113
 
2.5%
indian 28800
 
2.4%
Other values (1364) 159134
13.2%
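
The word table above lists lowercase tokens without trailing punctuation. A rough sketch of how such counts could be produced is shown below; the exact tokenisation used by the report is not stated here, so lowercasing, whitespace splitting, and stripping of ASCII punctuation are assumptions, and `word_frequencies` is a hypothetical helper name.

```python
import string
from collections import Counter


def word_frequencies(values):
    """Count lowercase, punctuation-stripped words across a text column."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing entries
            continue
        for token in value.lower().split():
            word = token.strip(string.punctuation)
            if word:  # ignore tokens that were punctuation only
                counts[word] += 1
    return counts


# Illustrative input only.
sample = [
    "North Pacific Ocean",
    "North Atlantic Ocean, Gulf of Mexico",
    "North Pacific Ocean",
    "Atlantic, Caribbean Sea",
    None,
]
freq = word_frequencies(sample)
```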

Most occurring characters

ValueCountFrequency (%)
886111
11.4%
a 885823
11.4%
c 796020
 
10.3%
i 598543
 
7.7%
n 555364
 
7.2%
t 522462
 
6.7%
e 465166
 
6.0%
o 359377
 
4.6%
O 297236
 
3.8%
h 288188
 
3.7%
Other values (58) 2091413
27.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5546317
71.6%
Uppercase Letter 1172272
 
15.1%
Space Separator 886111
 
11.4%
Other Punctuation 139267
 
1.8%
Dash Punctuation 1592
 
< 0.1%
Open Punctuation 72
 
< 0.1%
Close Punctuation 72
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 885823
16.0%
c 796020
14.4%
i 598543
10.8%
n 555364
10.0%
t 522462
9.4%
e 465166
8.4%
o 359377
6.5%
h 288188
 
5.2%
r 272734
 
4.9%
f 249485
 
4.5%
Other values (22) 553155
10.0%
Uppercase Letter
ValueCountFrequency (%)
O 297236
25.4%
N 202063
17.2%
P 187410
16.0%
S 148088
12.6%
A 121001
10.3%
C 45565
 
3.9%
B 39829
 
3.4%
G 37597
 
3.2%
M 31550
 
2.7%
I 31014
 
2.6%
Other values (16) 30919
 
2.6%
Other Punctuation
ValueCountFrequency (%)
, 137355
98.6%
; 1110
 
0.8%
' 517
 
0.4%
. 150
 
0.1%
: 80
 
0.1%
/ 55
 
< 0.1%
Space Separator
ValueCountFrequency (%)
886111
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1592
100.0%
Open Punctuation
ValueCountFrequency (%)
( 72
100.0%
Close Punctuation
ValueCountFrequency (%)
) 72
100.0%
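
The category tables above group characters by Unicode general category. A minimal sketch of that grouping using Python's `unicodedata` module follows. The long category names are mapped by hand for only the categories that appear for this variable, and `category_counts` is a hypothetical helper, not part of the report tooling.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the labels used in the tables.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category code."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing entries
            continue
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Illustrative input only.
counts = category_counts(["Gulf of Mexico (north)", None])
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```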

Most occurring scripts

ValueCountFrequency (%)
Latin 6718589
86.7%
Common 1027114
 
13.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 885823
13.2%
c 796020
11.8%
i 598543
 
8.9%
n 555364
 
8.3%
t 522462
 
7.8%
e 465166
 
6.9%
o 359377
 
5.3%
O 297236
 
4.4%
h 288188
 
4.3%
r 272734
 
4.1%
Other values (48) 1677676
25.0%
Common
ValueCountFrequency (%)
886111
86.3%
, 137355
 
13.4%
- 1592
 
0.2%
; 1110
 
0.1%
' 517
 
0.1%
. 150
 
< 0.1%
: 80
 
< 0.1%
( 72
 
< 0.1%
) 72
 
< 0.1%
/ 55
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7745257
> 99.9%
None 446
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
886111
11.4%
a 885823
11.4%
c 796020
 
10.3%
i 598543
 
7.7%
n 555364
 
7.2%
t 522462
 
6.7%
e 465166
 
6.0%
o 359377
 
4.6%
O 297236
 
3.8%
h 288188
 
3.7%
Other values (51) 2090967
27.0%
None
ValueCountFrequency (%)
í 171
38.3%
á 95
21.3%
ñ 68
 
15.2%
é 59
 
13.2%
ó 38
 
8.5%
è 13
 
2.9%
É 2
 
0.4%
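
The characters listed under "None" above are accented Latin letters outside the ASCII range. One way to locate such characters in a column is sketched below, assuming the values are available as Python strings; the sample strings and the helper name `non_ascii_counts` are made up for illustration.

```python
import unicodedata
from collections import Counter


def non_ascii_counts(values):
    """Count every character whose code point lies outside ASCII."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing entries
            continue
        counts.update(ch for ch in value if ord(ch) > 127)
    return counts


# Hypothetical values; \u00ed is i-acute and \u00e1 is a-acute.
sample = ["Bah\u00eda de Panam\u00e1", "R\u00edo Orinoco", "North Pacific Ocean"]
found = non_ascii_counts(sample)
for ch, n in found.most_common():
    print(ch, n, unicodedata.name(ch, "UNKNOWN"))
```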

islandGroup
Text

Missing 

Distinct 323
Distinct (%) 0.5%
Missing 390811
Missing (%) 85.9%
Memory size 3.5 MiB

Length

Max length 45
Median length 32
Mean length 14.81478548
Min length 4

Characters and Unicode

Total characters 954087
Distinct characters 64
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 37
Unique (%) 0.1%

Sample

1st row Florida Islands
2nd row Vava'u Group
3rd row Visayas
4th row Cuyo Islands
5th row Ha'apai Group
ValueCountFrequency (%)
islands 31617
22.3%
group 13997
 
9.9%
chain 5485
 
3.9%
visayas 4942
 
3.5%
leeward 4824
 
3.4%
ralik 4613
 
3.3%
bahama 2866
 
2.0%
island 2805
 
2.0%
cruz 2205
 
1.6%
santa 2205
 
1.6%
Other values (354) 66278
46.7%

Most occurring characters

ValueCountFrequency (%)
a 143525
15.0%
s 92363
 
9.7%
77436
 
8.1%
n 71038
 
7.4%
l 57390
 
6.0%
d 51466
 
5.4%
r 46473
 
4.9%
u 38564
 
4.0%
o 37605
 
3.9%
i 37528
 
3.9%
Other values (54) 300699
31.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 728655
76.4%
Uppercase Letter 142911
 
15.0%
Space Separator 77436
 
8.1%
Open Punctuation 1946
 
0.2%
Close Punctuation 1946
 
0.2%
Other Punctuation 1144
 
0.1%
Format 30
 
< 0.1%
Dash Punctuation 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 143525
19.7%
s 92363
12.7%
n 71038
9.7%
l 57390
 
7.9%
d 51466
 
7.1%
r 46473
 
6.4%
u 38564
 
5.3%
o 37605
 
5.2%
i 37528
 
5.2%
e 33860
 
4.6%
Other values (20) 118843
16.3%
Uppercase Letter
ValueCountFrequency (%)
I 34888
24.4%
C 16633
11.6%
G 16301
11.4%
B 11324
 
7.9%
S 9420
 
6.6%
L 8923
 
6.2%
R 8182
 
5.7%
V 7087
 
5.0%
T 5921
 
4.1%
A 5185
 
3.6%
Other values (16) 19047
13.3%
Open Punctuation
ValueCountFrequency (%)
( 1764
90.6%
[ 182
 
9.4%
Close Punctuation
ValueCountFrequency (%)
) 1764
90.6%
] 182
 
9.4%
Space Separator
ValueCountFrequency (%)
77436
100.0%
Other Punctuation
ValueCountFrequency (%)
' 1144
100.0%
Format
ValueCountFrequency (%)
30
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 871566
91.4%
Common 82521
 
8.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 143525
16.5%
s 92363
 
10.6%
n 71038
 
8.2%
l 57390
 
6.6%
d 51466
 
5.9%
r 46473
 
5.3%
u 38564
 
4.4%
o 37605
 
4.3%
i 37528
 
4.3%
I 34888
 
4.0%
Other values (46) 260726
29.9%
Common
ValueCountFrequency (%)
77436
93.8%
( 1764
 
2.1%
) 1764
 
2.1%
' 1144
 
1.4%
[ 182
 
0.2%
] 182
 
0.2%
30
 
< 0.1%
- 19
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 953947
> 99.9%
None 110
 
< 0.1%
Punctuation 30
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 143525
15.0%
s 92363
 
9.7%
77436
 
8.1%
n 71038
 
7.4%
l 57390
 
6.0%
d 51466
 
5.4%
r 46473
 
4.9%
u 38564
 
4.0%
o 37605
 
3.9%
i 37528
 
3.9%
Other values (48) 300559
31.5%
None
ValueCountFrequency (%)
Ō 78
70.9%
ñ 18
 
16.4%
ù 5
 
4.5%
à 5
 
4.5%
á 4
 
3.6%
Punctuation
ValueCountFrequency (%)
30
100.0%

island
Text

Missing 

Distinct 2224
Distinct (%) 1.2%
Missing 270596
Missing (%) 59.4%
Memory size 3.5 MiB

Length

Max length 43
Median length 37
Mean length 9.782494475
Min length 3

Characters and Unicode

Total characters 1806005
Distinct characters 80
Distinct categories 9
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 463
Unique (%) 0.3%

Sample

1st row Honshu
2nd row Barbados
3rd row Putic Island
4th row Guam
5th row Florida Island
ValueCountFrequency (%)
island 45621
 
15.8%
bermuda 14507
 
5.0%
atoll 13109
 
4.5%
luzon 7631
 
2.6%
oahu 6792
 
2.3%
cay 5201
 
1.8%
carrie 3799
 
1.3%
bow 3799
 
1.3%
new 3013
 
1.0%
cuba 2705
 
0.9%
Other values (2080) 182972
63.3%

Most occurring characters

ValueCountFrequency (%)
a 268860
14.9%
n 133966
 
7.4%
o 116663
 
6.5%
l 110963
 
6.1%
104533
 
5.8%
u 90794
 
5.0%
e 88610
 
4.9%
i 86645
 
4.8%
r 86277
 
4.8%
d 84321
 
4.7%
Other values (70) 634373
35.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1403163
77.7%
Uppercase Letter 287699
 
15.9%
Space Separator 104533
 
5.8%
Open Punctuation 3752
 
0.2%
Close Punctuation 3752
 
0.2%
Other Punctuation 1868
 
0.1%
Dash Punctuation 1229
 
0.1%
Decimal Number 8
 
< 0.1%
Modifier Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 268860
19.2%
n 133966
9.5%
o 116663
8.3%
l 110963
7.9%
u 90794
 
6.5%
e 88610
 
6.3%
i 86645
 
6.2%
r 86277
 
6.1%
d 84321
 
6.0%
s 83762
 
6.0%
Other values (30) 252302
18.0%
Uppercase Letter
ValueCountFrequency (%)
I 50631
17.6%
B 35471
12.3%
C 23025
 
8.0%
M 22483
 
7.8%
A 21385
 
7.4%
S 16304
 
5.7%
T 14185
 
4.9%
L 13182
 
4.6%
O 10784
 
3.7%
N 10606
 
3.7%
Other values (17) 69643
24.2%
Other Punctuation
ValueCountFrequency (%)
' 1251
67.0%
. 551
29.5%
/ 39
 
2.1%
, 27
 
1.4%
Open Punctuation
ValueCountFrequency (%)
( 2827
75.3%
[ 925
 
24.7%
Close Punctuation
ValueCountFrequency (%)
) 2827
75.3%
] 925
 
24.7%
Decimal Number
ValueCountFrequency (%)
3 4
50.0%
0 4
50.0%
Space Separator
ValueCountFrequency (%)
104533
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1229
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690862
93.6%
Common 115143
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 268860
15.9%
n 133966
 
7.9%
o 116663
 
6.9%
l 110963
 
6.6%
u 90794
 
5.4%
e 88610
 
5.2%
i 86645
 
5.1%
r 86277
 
5.1%
d 84321
 
5.0%
s 83762
 
5.0%
Other values (57) 540001
31.9%
Common
ValueCountFrequency (%)
104533
90.8%
( 2827
 
2.5%
) 2827
 
2.5%
' 1251
 
1.1%
- 1229
 
1.1%
[ 925
 
0.8%
] 925
 
0.8%
. 551
 
0.5%
/ 39
 
< 0.1%
, 27
 
< 0.1%
Other values (3) 9
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1805500
> 99.9%
None 488
 
< 0.1%
Latin Ext Additional 16
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 268860
14.9%
n 133966
 
7.4%
o 116663
 
6.5%
l 110963
 
6.1%
104533
 
5.8%
u 90794
 
5.0%
e 88610
 
4.9%
i 86645
 
4.8%
r 86277
 
4.8%
d 84321
 
4.7%
Other values (53) 633868
35.1%
None
ValueCountFrequency (%)
ó 101
20.7%
é 91
18.6%
á 85
17.4%
ñ 65
13.3%
Î 50
10.2%
ú 41
8.4%
í 17
 
3.5%
Á 12
 
2.5%
â 10
 
2.0%
ô 7
 
1.4%
Other values (4) 9
 
1.8%
Latin Ext Additional
ValueCountFrequency (%)
15
93.8%
1
 
6.2%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

countryCode
Text

Missing 

Distinct 217
Distinct (%) 0.1%
Missing 30434
Missing (%) 6.7%
Memory size 3.5 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 849556
Distinct characters 26
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 10
Unique (%) < 0.1%

Sample

1st row US
2nd row US
3rd row JP
4th row US
5th row BB
ValueCountFrequency (%)
us 124564
29.3%
ph 46190
 
10.9%
bm 15821
 
3.7%
id 12805
 
3.0%
br 11602
 
2.7%
pa 10456
 
2.5%
pf 9998
 
2.4%
pg 7692
 
1.8%
jp 7188
 
1.7%
au 7086
 
1.7%
Other values (207) 171376
40.3%

Most occurring characters

ValueCountFrequency (%)
U 147534
17.4%
S 146438
17.2%
P 91776
10.8%
H 58467
 
6.9%
B 50362
 
5.9%
M 50159
 
5.9%
C 28118
 
3.3%
A 26140
 
3.1%
I 23175
 
2.7%
T 22845
 
2.7%
Other values (16) 204542
24.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 849556
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 147534
17.4%
S 146438
17.2%
P 91776
10.8%
H 58467
 
6.9%
B 50362
 
5.9%
M 50159
 
5.9%
C 28118
 
3.3%
A 26140
 
3.1%
I 23175
 
2.7%
T 22845
 
2.7%
Other values (16) 204542
24.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 849556
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 147534
17.4%
S 146438
17.2%
P 91776
10.8%
H 58467
 
6.9%
B 50362
 
5.9%
M 50159
 
5.9%
C 28118
 
3.3%
A 26140
 
3.1%
I 23175
 
2.7%
T 22845
 
2.7%
Other values (16) 204542
24.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 849556
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 147534
17.4%
S 146438
17.2%
P 91776
10.8%
H 58467
 
6.9%
B 50362
 
5.9%
M 50159
 
5.9%
C 28118
 
3.3%
A 26140
 
3.1%
I 23175
 
2.7%
T 22845
 
2.7%
Other values (16) 204542
24.1%

stateProvince
Text

Missing 

Distinct 1486
Distinct (%) 0.5%
Missing 174301
Missing (%) 38.3%
Memory size 3.5 MiB

Length

Max length 48
Median length 36
Mean length 11.08342144
Min length 3

Characters and Unicode

Total characters 3113455
Distinct characters 97
Distinct categories 8
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 252
Unique (%) 0.1%

Sample

1st row Hawaii
2nd row Florida
3rd row Tokyo Prefecture
4th row West Virginia
5th row Palawan
ValueCountFrequency (%)
province 30907
 
7.1%
florida 17201
 
4.0%
carolina 12504
 
2.9%
virginia 11495
 
2.7%
hawaii 10718
 
2.5%
north 9674
 
2.2%
region 9360
 
2.2%
south 8306
 
1.9%
maryland 7690
 
1.8%
islands 6712
 
1.6%
Other values (1479) 308426
71.2%

Most occurring characters

ValueCountFrequency (%)
a 411155
13.2%
i 266017
 
8.5%
n 234755
 
7.5%
o 224195
 
7.2%
e 216314
 
6.9%
r 215084
 
6.9%
152082
 
4.9%
s 136931
 
4.4%
t 132525
 
4.3%
l 123796
 
4.0%
Other values (87) 1000601
32.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2526789
81.2%
Uppercase Letter 429904
 
13.8%
Space Separator 152082
 
4.9%
Dash Punctuation 2749
 
0.1%
Other Punctuation 1809
 
0.1%
Open Punctuation 58
 
< 0.1%
Close Punctuation 58
 
< 0.1%
Decimal Number 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 411155
16.3%
i 266017
10.5%
n 234755
9.3%
o 224195
8.9%
e 216314
8.6%
r 215084
8.5%
s 136931
 
5.4%
t 132525
 
5.2%
l 123796
 
4.9%
u 87847
 
3.5%
Other values (44) 478170
18.9%
Uppercase Letter
ValueCountFrequency (%)
P 57745
13.4%
M 38630
 
9.0%
S 38477
 
9.0%
C 36517
 
8.5%
N 28113
 
6.5%
T 23306
 
5.4%
A 22914
 
5.3%
D 18317
 
4.3%
F 18140
 
4.2%
R 16356
 
3.8%
Other values (22) 131389
30.6%
Other Punctuation
ValueCountFrequency (%)
' 907
50.1%
. 898
49.6%
, 2
 
0.1%
/ 2
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 2719
98.9%
30
 
1.1%
Decimal Number
ValueCountFrequency (%)
0 3
50.0%
1 3
50.0%
Space Separator
ValueCountFrequency (%)
152082
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 58
100.0%
Close Punctuation
ValueCountFrequency (%)
] 58
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2956693
95.0%
Common 156762
 
5.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 411155
13.9%
i 266017
 
9.0%
n 234755
 
7.9%
o 224195
 
7.6%
e 216314
 
7.3%
r 215084
 
7.3%
s 136931
 
4.6%
t 132525
 
4.5%
l 123796
 
4.2%
u 87847
 
3.0%
Other values (76) 908074
30.7%
Common
ValueCountFrequency (%)
152082
97.0%
- 2719
 
1.7%
' 907
 
0.6%
. 898
 
0.6%
[ 58
 
< 0.1%
] 58
 
< 0.1%
30
 
< 0.1%
0 3
 
< 0.1%
1 3
 
< 0.1%
, 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3108939
99.9%
None 4463
 
0.1%
Punctuation 30
 
< 0.1%
Latin Ext Additional 23
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 411155
13.2%
i 266017
 
8.6%
n 234755
 
7.6%
o 224195
 
7.2%
e 216314
 
7.0%
r 215084
 
6.9%
152082
 
4.9%
s 136931
 
4.4%
t 132525
 
4.3%
l 123796
 
4.0%
Other values (52) 996085
32.0%
None
ValueCountFrequency (%)
á 1932
43.3%
ó 1584
35.5%
í 532
 
11.9%
é 135
 
3.0%
ã 109
 
2.4%
ê 36
 
0.8%
Á 27
 
0.6%
è 20
 
0.4%
å 11
 
0.2%
É 10
 
0.2%
Other values (19) 67
 
1.5%
Punctuation
ValueCountFrequency (%)
30
100.0%
Latin Ext Additional
ValueCountFrequency (%)
5
21.7%
ế 5
21.7%
5
21.7%
4
17.4%
4
17.4%

county
Text

Missing 

Distinct 2317
Distinct (%) 2.4%
Missing 357533
Missing (%) 78.5%
Memory size 3.5 MiB

Length

Max length 46
Median length 40
Mean length 14.85270119
Min length 3

Characters and Unicode

Total characters 1450797
Distinct characters 87
Distinct categories 9
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 418
Unique (%) 0.4%

Sample

1st row Hillsborough County
2nd row Randolph County
3rd row Thoothukudi District
4th row Calvert County
5th row New Hanover County
ValueCountFrequency (%)
county 71646
34.9%
district 9126
 
4.4%
honolulu 5828
 
2.8%
monroe 3066
 
1.5%
parish 2197
 
1.1%
carteret 1943
 
0.9%
borough 1790
 
0.9%
san 1543
 
0.8%
montgomery 1350
 
0.7%
barnstable 1256
 
0.6%
Other values (2386) 105465
51.4%

Most occurring characters

ValueCountFrequency (%)
n 144007
 
9.9%
o 142669
 
9.8%
t 126035
 
8.7%
u 110453
 
7.6%
107531
 
7.4%
a 92081
 
6.3%
C 86847
 
6.0%
y 83560
 
5.8%
e 75882
 
5.2%
r 67041
 
4.6%
Other values (77) 414691
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1134683
78.2%
Uppercase Letter 205688
 
14.2%
Space Separator 107531
 
7.4%
Other Punctuation 2008
 
0.1%
Dash Punctuation 823
 
0.1%
Open Punctuation 22
 
< 0.1%
Close Punctuation 22
 
< 0.1%
Decimal Number 14
 
< 0.1%
Modifier Letter 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 144007
12.7%
o 142669
12.6%
t 126035
11.1%
u 110453
9.7%
a 92081
8.1%
y 83560
7.4%
e 75882
6.7%
r 67041
 
5.9%
i 59185
 
5.2%
l 50106
 
4.4%
Other values (37) 183664
16.2%
Uppercase Letter
ValueCountFrequency (%)
C 86847
42.2%
M 15547
 
7.6%
D 12611
 
6.1%
H 9959
 
4.8%
B 9681
 
4.7%
P 9349
 
4.5%
S 8796
 
4.3%
A 7470
 
3.6%
L 6041
 
2.9%
W 5570
 
2.7%
Other values (19) 33817
 
16.4%
Other Punctuation
ValueCountFrequency (%)
' 1297
64.6%
. 653
32.5%
, 58
 
2.9%
Dash Punctuation
ValueCountFrequency (%)
- 669
81.3%
154
 
18.7%
Decimal Number
ValueCountFrequency (%)
2 13
92.9%
4 1
 
7.1%
Space Separator
ValueCountFrequency (%)
107531
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1340371
92.4%
Common 110426
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 144007
10.7%
o 142669
10.6%
t 126035
 
9.4%
u 110453
 
8.2%
a 92081
 
6.9%
C 86847
 
6.5%
y 83560
 
6.2%
e 75882
 
5.7%
r 67041
 
5.0%
i 59185
 
4.4%
Other values (66) 352611
26.3%
Common
ValueCountFrequency (%)
107531
97.4%
' 1297
 
1.2%
- 669
 
0.6%
. 653
 
0.6%
154
 
0.1%
, 58
 
0.1%
( 22
 
< 0.1%
) 22
 
< 0.1%
2 13
 
< 0.1%
ʻ 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1448457
99.8%
None 2167
 
0.1%
Punctuation 154
 
< 0.1%
Latin Ext Additional 13
 
< 0.1%
Modifier Letters 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 144007
 
9.9%
o 142669
 
9.8%
t 126035
 
8.7%
u 110453
 
7.6%
107531
 
7.4%
a 92081
 
6.4%
C 86847
 
6.0%
y 83560
 
5.8%
e 75882
 
5.2%
r 67041
 
4.6%
Other values (51) 412351
28.5%
None
ValueCountFrequency (%)
ó 969
44.7%
á 512
23.6%
í 396
18.3%
é 78
 
3.6%
ú 66
 
3.0%
Ø 51
 
2.4%
ü 40
 
1.8%
ñ 21
 
1.0%
ō 15
 
0.7%
ū 6
 
0.3%
Other values (10) 13
 
0.6%
Punctuation
ValueCountFrequency (%)
154
100.0%
Latin Ext Additional
ValueCountFrequency (%)
10
76.9%
1
 
7.7%
1
 
7.7%
ế 1
 
7.7%
Modifier Letters
ValueCountFrequency (%)
ʻ 6
100.0%

locality
Text

Missing 

Distinct 63950
Distinct (%) 15.6%
Missing 45084
Missing (%) 9.9%
Memory size 3.5 MiB

Length

Max length 653
Median length 273
Mean length 54.14135587
Min length 1

Characters and Unicode

Total characters 22204886
Distinct characters 113
Distinct categories 17
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 31252
Unique (%) 7.6%

Sample

1st row Hawaii
2nd row Tampa, Florida
3rd row Tokyo, Japan
4th row West Virginia, Randolph County, Shaver's Fork at Cheat Bridge on US Route 250 (Durbin Quad)
5th row No Data
ValueCountFrequency (%)
of 176506
 
5.1%
island 102327
 
3.0%
islands 48966
 
1.4%
bay 45484
 
1.3%
river 43645
 
1.3%
reef 43102
 
1.2%
off 42172
 
1.2%
and 41056
 
1.2%
at 38682
 
1.1%
south 38327
 
1.1%
Other values (37098) 2831019
82.0%

Most occurring characters

ValueCountFrequency (%)
3041158
 
13.7%
a 2147970
 
9.7%
e 1545770
 
7.0%
o 1444005
 
6.5%
n 1302074
 
5.9%
i 1144156
 
5.2%
t 1048248
 
4.7%
r 1047962
 
4.7%
s 988467
 
4.5%
l 837964
 
3.8%
Other values (103) 7657112
34.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15649263
70.5%
Space Separator 3041158
 
13.7%
Uppercase Letter 2205461
 
9.9%
Other Punctuation 922659
 
4.2%
Decimal Number 240591
 
1.1%
Open Punctuation 53105
 
0.2%
Close Punctuation 53070
 
0.2%
Dash Punctuation 38007
 
0.2%
Math Symbol 1469
 
< 0.1%
Other Symbol 56
 
< 0.1%
Other values (7) 47
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2147970
13.7%
e 1545770
9.9%
o 1444005
9.2%
n 1302074
 
8.3%
i 1144156
 
7.3%
t 1048248
 
6.7%
r 1047962
 
6.7%
s 988467
 
6.3%
l 837964
 
5.4%
u 586811
 
3.7%
Other values (29) 3555836
22.7%
Uppercase Letter
ValueCountFrequency (%)
S 210973
 
9.6%
C 210043
 
9.5%
I 192891
 
8.7%
B 180023
 
8.2%
P 172477
 
7.8%
M 151758
 
6.9%
R 133803
 
6.1%
N 108616
 
4.9%
A 99700
 
4.5%
T 89914
 
4.1%
Other values (18) 655263
29.7%
Other Punctuation
ValueCountFrequency (%)
, 682754
74.0%
. 154051
 
16.7%
; 33355
 
3.6%
: 23885
 
2.6%
' 15305
 
1.7%
/ 6811
 
0.7%
" 3864
 
0.4%
? 1335
 
0.1%
# 479
 
0.1%
* 470
 
0.1%
Other values (3) 350
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 46824
19.5%
0 42199
17.5%
2 35959
14.9%
5 25415
10.6%
3 22663
9.4%
4 19158
8.0%
7 13345
 
5.5%
6 13208
 
5.5%
8 12180
 
5.1%
9 9640
 
4.0%
Math Symbol
ValueCountFrequency (%)
= 1198
81.6%
~ 135
 
9.2%
+ 132
 
9.0%
> 4
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 48046
90.5%
[ 5050
 
9.5%
{ 9
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 48008
90.5%
] 5056
 
9.5%
} 6
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 38004
> 99.9%
3
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
9
81.8%
2
 
18.2%
Modifier Symbol
ValueCountFrequency (%)
´ 2
66.7%
^ 1
33.3%
Space Separator
ValueCountFrequency (%)
3041158
100.0%
Other Symbol
ValueCountFrequency (%)
° 56
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 25
100.0%
Control
ValueCountFrequency (%)
 4
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%
Other Number
ValueCountFrequency (%)
½ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17854724
80.4%
Common 4350162
 
19.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2147970
 
12.0%
e 1545770
 
8.7%
o 1444005
 
8.1%
n 1302074
 
7.3%
i 1144156
 
6.4%
t 1048248
 
5.9%
r 1047962
 
5.9%
s 988467
 
5.5%
l 837964
 
4.7%
u 586811
 
3.3%
Other values (57) 5761297
32.3%
Common
ValueCountFrequency (%)
3041158
69.9%
, 682754
 
15.7%
. 154051
 
3.5%
( 48046
 
1.1%
) 48008
 
1.1%
1 46824
 
1.1%
0 42199
 
1.0%
- 38004
 
0.9%
2 35959
 
0.8%
; 33355
 
0.8%
Other values (36) 179804
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22204428
> 99.9%
None 441
 
< 0.1%
Punctuation 16
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3041158
 
13.7%
a 2147970
 
9.7%
e 1545770
 
7.0%
o 1444005
 
6.5%
n 1302074
 
5.9%
i 1144156
 
5.2%
t 1048248
 
4.7%
r 1047962
 
4.7%
s 988467
 
4.5%
l 837964
 
3.8%
Other values (79) 7656654
34.5%
None
ValueCountFrequency (%)
á 148
33.6%
é 71
16.1%
ã 67
15.2%
° 56
 
12.7%
ø 37
 
8.4%
à 14
 
3.2%
ó 14
 
3.2%
í 10
 
2.3%
Ù 5
 
1.1%
 4
 
0.9%
Other values (9) 15
 
3.4%
Punctuation
ValueCountFrequency (%)
9
56.2%
3
 
18.8%
2
 
12.5%
2
 
12.5%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

verbatimElevation
Text

Missing 

Distinct 76
Distinct (%) 3.4%
Missing 453008
Missing (%) 99.5%
Memory size 3.5 MiB

Length

Max length 152
Median length 68
Mean length 46.38838475
Min length 3

Characters and Unicode

Total characters 102240
Distinct characters 70
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 13
Unique (%) 0.6%

Sample

1st row Rotenone put out at 90' and 120', pickup was surface to 140', several (fiscos=factors?) prevented an even better collection.
2nd row Distance from shore: 1000 feet
3rd row 32 not found in field notes so could be inaccurate.
4th row Distance from shore: 1500 feet
5th row Naso was speared by P.W. (Paul D. West)
ValueCountFrequency (%)
feet 1680
 
9.0%
distance 1141
 
6.1%
from 1097
 
5.9%
to 1064
 
5.7%
shore 1048
 
5.6%
at 595
 
3.2%
499
 
2.7%
and 445
 
2.4%
rotenone 430
 
2.3%
put 309
 
1.6%
Other values (175) 10428
55.7%

Most occurring characters

ValueCountFrequency (%)
16532
16.2%
e 10266
 
10.0%
t 7363
 
7.2%
o 6811
 
6.7%
a 5065
 
5.0%
s 4792
 
4.7%
f 4205
 
4.1%
n 4144
 
4.1%
r 4103
 
4.0%
0 3353
 
3.3%
Other values (60) 35606
34.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 67443
66.0%
Space Separator 16532
 
16.2%
Decimal Number 7713
 
7.5%
Uppercase Letter 5089
 
5.0%
Other Punctuation 3746
 
3.7%
Dash Punctuation 742
 
0.7%
Open Punctuation 462
 
0.5%
Close Punctuation 462
 
0.5%
Math Symbol 51
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 10266
15.2%
t 7363
10.9%
o 6811
10.1%
a 5065
 
7.5%
s 4792
 
7.1%
f 4205
 
6.2%
n 4144
 
6.1%
r 4103
 
6.1%
i 3239
 
4.8%
c 2531
 
3.8%
Other values (15) 14924
22.1%
Uppercase Letter
ValueCountFrequency (%)
D 1545
30.4%
T 692
13.6%
P 592
 
11.6%
W 574
 
11.3%
A 431
 
8.5%
R 430
 
8.4%
C 159
 
3.1%
N 135
 
2.7%
G 90
 
1.8%
V 77
 
1.5%
Other values (10) 364
 
7.2%
Decimal Number
ValueCountFrequency (%)
0 3353
43.5%
1 1286
 
16.7%
5 905
 
11.7%
2 800
 
10.4%
7 383
 
5.0%
6 218
 
2.8%
8 215
 
2.8%
3 210
 
2.7%
4 199
 
2.6%
9 144
 
1.9%
Other Punctuation
ValueCountFrequency (%)
. 1561
41.7%
: 1444
38.5%
, 292
 
7.8%
' 270
 
7.2%
" 97
 
2.6%
? 51
 
1.4%
; 30
 
0.8%
/ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 318
68.8%
[ 144
31.2%
Close Punctuation
ValueCountFrequency (%)
) 318
68.8%
] 144
31.2%
Space Separator
ValueCountFrequency (%)
16532
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 742
100.0%
Math Symbol
ValueCountFrequency (%)
= 51
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 72532
70.9%
Common 29708
29.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 10266
14.2%
t 7363
 
10.2%
o 6811
 
9.4%
a 5065
 
7.0%
s 4792
 
6.6%
f 4205
 
5.8%
n 4144
 
5.7%
r 4103
 
5.7%
i 3239
 
4.5%
c 2531
 
3.5%
Other values (35) 20013
27.6%
Common
ValueCountFrequency (%)
16532
55.6%
0 3353
 
11.3%
. 1561
 
5.3%
: 1444
 
4.9%
1 1286
 
4.3%
5 905
 
3.0%
2 800
 
2.7%
- 742
 
2.5%
7 383
 
1.3%
( 318
 
1.1%
Other values (15) 2384
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 102240
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
16532
16.2%
e 10266
 
10.0%
t 7363
 
7.2%
o 6811
 
6.7%
a 5065
 
5.0%
s 4792
 
4.7%
f 4205
 
4.1%
n 4144
 
4.1%
r 4103
 
4.0%
0 3353
 
3.3%
Other values (60) 35606
34.8%

verbatimDepth
Text

Missing 

Distinct 230
Distinct (%) 2.7%
Missing 446636
Missing (%) 98.1%
Memory size 3.5 MiB

Length

Max length 72
Median length 67
Mean length 8.249766791
Min length 1

Characters and Unicode

Total characters 70750
Distinct characters 76
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 95
Unique (%) 1.1%

Sample

1st row Depth trawl: 135 fathoms
2nd row 15 minutes at depth
3rd row Surface
4th row CA
5th row 15 minutes at depth
ValueCountFrequency (%)
ca 3930
25.9%
surface 2351
15.5%
depth 865
 
5.7%
at 571
 
3.8%
00000000 543
 
3.6%
to 505
 
3.3%
minutes 343
 
2.3%
fathoms 330
 
2.2%
m 320
 
2.1%
trawl 287
 
1.9%
Other values (305) 5114
33.7%

Most occurring characters

ValueCountFrequency (%)
0 7671
 
10.8%
6583
 
9.3%
e 5141
 
7.3%
a 4475
 
6.3%
t 4124
 
5.8%
A 4103
 
5.8%
C 3949
 
5.6%
r 3468
 
4.9%
f 3202
 
4.5%
u 3133
 
4.4%
Other values (66) 24901
35.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 39537
55.9%
Uppercase Letter 11262
 
15.9%
Decimal Number 10956
 
15.5%
Space Separator 6583
 
9.3%
Other Punctuation 1980
 
2.8%
Dash Punctuation 370
 
0.5%
Math Symbol 32
 
< 0.1%
Open Punctuation 26
 
< 0.1%
Close Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5141
13.0%
a 4475
11.3%
t 4124
10.4%
r 3468
8.8%
f 3202
8.1%
u 3133
 
7.9%
c 2518
 
6.4%
o 2218
 
5.6%
h 1721
 
4.4%
s 1672
 
4.2%
Other values (16) 7865
19.9%
Uppercase Letter
ValueCountFrequency (%)
A 4103
36.4%
C 3949
35.1%
S 2270
20.2%
D 171
 
1.5%
O 167
 
1.5%
T 137
 
1.2%
M 98
 
0.9%
H 88
 
0.8%
I 57
 
0.5%
B 46
 
0.4%
Other values (14) 176
 
1.6%
Decimal Number
ValueCountFrequency (%)
0 7671
70.0%
5 776
 
7.1%
1 744
 
6.8%
3 517
 
4.7%
2 450
 
4.1%
6 241
 
2.2%
9 179
 
1.6%
7 147
 
1.3%
8 119
 
1.1%
4 112
 
1.0%
Other Punctuation
ValueCountFrequency (%)
. 699
35.3%
, 459
23.2%
' 405
20.5%
: 177
 
8.9%
; 128
 
6.5%
" 106
 
5.4%
# 4
 
0.2%
? 1
 
0.1%
* 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
< 23
71.9%
= 8
 
25.0%
~ 1
 
3.1%
Space Separator
ValueCountFrequency (%)
6583
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 370
100.0%
Open Punctuation
ValueCountFrequency (%)
( 26
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 50799
71.8%
Common 19951
 
28.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5141
 
10.1%
a 4475
 
8.8%
t 4124
 
8.1%
A 4103
 
8.1%
C 3949
 
7.8%
r 3468
 
6.8%
f 3202
 
6.3%
u 3133
 
6.2%
c 2518
 
5.0%
S 2270
 
4.5%
Other values (40) 14416
28.4%
Common
ValueCountFrequency (%)
0 7671
38.4%
6583
33.0%
5 776
 
3.9%
1 744
 
3.7%
. 699
 
3.5%
3 517
 
2.6%
, 459
 
2.3%
2 450
 
2.3%
' 405
 
2.0%
- 370
 
1.9%
Other values (16) 1277
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 70750
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7671
 
10.8%
6583
 
9.3%
e 5141
 
7.3%
a 4475
 
6.3%
t 4124
 
5.8%
A 4103
 
5.8%
C 3949
 
5.6%
r 3468
 
4.9%
f 3202
 
4.5%
u 3133
 
4.4%
Other values (66) 24901
35.2%

minimumDistanceAboveSurfaceInMeters
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 455211
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 20
Median length 20
Mean length 20
Min length 20

Characters and Unicode

Total characters 20
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Williams, Jeffrey T.
ValueCountFrequency (%)
williams 1
33.3%
jeffrey 1
33.3%
t 1
33.3%

Most occurring characters

ValueCountFrequency (%)
i 2
 
10.0%
l 2
 
10.0%
2
 
10.0%
e 2
 
10.0%
f 2
 
10.0%
W 1
 
5.0%
a 1
 
5.0%
m 1
 
5.0%
s 1
 
5.0%
, 1
 
5.0%
Other values (5) 5
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13
65.0%
Uppercase Letter 3
 
15.0%
Space Separator 2
 
10.0%
Other Punctuation 2
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
15.4%
l 2
15.4%
e 2
15.4%
f 2
15.4%
a 1
7.7%
m 1
7.7%
s 1
7.7%
r 1
7.7%
y 1
7.7%
Uppercase Letter
ValueCountFrequency (%)
W 1
33.3%
J 1
33.3%
T 1
33.3%
Other Punctuation
ValueCountFrequency (%)
, 1
50.0%
. 1
50.0%
Space Separator
ValueCountFrequency (%)
(space) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
80.0%
Common 4
 
20.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
12.5%
l 2
12.5%
e 2
12.5%
f 2
12.5%
W 1
6.2%
a 1
6.2%
m 1
6.2%
s 1
6.2%
J 1
6.2%
r 1
6.2%
Other values (2) 2
12.5%
Common
ValueCountFrequency (%)
(space) 2
50.0%
, 1
25.0%
. 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
 
10.0%
l 2
 
10.0%
(space) 2
 
10.0%
e 2
 
10.0%
f 2
 
10.0%
W 1
 
5.0%
a 1
 
5.0%
m 1
 
5.0%
s 1
 
5.0%
, 1
 
5.0%
Other values (5) 5
25.0%

decimalLatitude
Text

Missing 

Distinct15632
Distinct (%)7.8%
Missing254257
Missing (%)55.9%
Memory size3.5 MiB

Length

Max length9
Median length8
Mean length6.000413028
Min length3

Characters and Unicode

Total characters1205813
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4363 ?
Unique (%)2.2%

Sample

1st row13.2431
2nd row10.9181
3rd row31.93
4th row10.72
5th row-2.0517
ValueCountFrequency (%)
12.5 1211
 
0.6%
27.9 868
 
0.4%
16.8 711
 
0.4%
12.0832 620
 
0.3%
21.417 545
 
0.3%
19.1606 541
 
0.3%
32.23 510
 
0.3%
32.17 503
 
0.3%
32.3 491
 
0.2%
28.4933 489
 
0.2%
Other values (14220) 194466
96.8%

Most occurring characters

ValueCountFrequency (%)
. 200955
16.7%
3 140183
11.6%
1 139665
11.6%
2 127101
10.5%
8 92280
7.7%
7 91691
7.6%
5 84666
7.0%
4 71044
 
5.9%
6 69792
 
5.8%
9 69560
 
5.8%
Other values (2) 118876
9.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 947134
78.5%
Other Punctuation 200955
 
16.7%
Dash Punctuation 57724
 
4.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 140183
14.8%
1 139665
14.7%
2 127101
13.4%
8 92280
9.7%
7 91691
9.7%
5 84666
8.9%
4 71044
7.5%
6 69792
7.4%
9 69560
7.3%
0 61152
6.5%
Other Punctuation
ValueCountFrequency (%)
. 200955
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 57724
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1205813
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 200955
16.7%
3 140183
11.6%
1 139665
11.6%
2 127101
10.5%
8 92280
7.7%
7 91691
7.6%
5 84666
7.0%
4 71044
 
5.9%
6 69792
 
5.8%
9 69560
 
5.8%
Other values (2) 118876
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1205813
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 200955
16.7%
3 140183
11.6%
1 139665
11.6%
2 127101
10.5%
8 92280
7.7%
7 91691
7.6%
5 84666
7.0%
4 71044
 
5.9%
6 69792
 
5.8%
9 69560
 
5.8%
Other values (2) 118876
9.9%
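Because decimalLatitude is profiled as Text, downstream use would normally start by converting the strings to numbers and rejecting anything outside the valid range. The following is a hedged sketch of such a conversion using the sample rows shown above. The function `parse_coordinate` and its `limit` parameter are hypothetical names, not part of the dataset or the profiling tool.

```python
def parse_coordinate(text, limit):
    """Parse a coordinate string to float.

    Returns None when the value is missing, malformed, or outside
    the closed interval [-limit, limit].
    """
    if text is None:
        return None
    try:
        value = float(text.strip())
    except ValueError:
        return None
    # NaN fails both comparisons, so it is rejected here as well
    return value if -limit <= value <= limit else None

samples = ["13.2431", "10.9181", "31.93", "10.72", "-2.0517"]
latitudes = [parse_coordinate(s, 90.0) for s in samples]

# Negative latitudes lie south of the equator
southern = sum(1 for v in latitudes if v is not None and v < 0)
print(latitudes, southern)
```

The same helper could be called with a limit of 180.0 for longitude strings.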

decimalLongitude
Text

Missing 

Distinct17148
Distinct (%)8.5%
Missing254257
Missing (%)55.9%
Memory size3.5 MiB

Length

Max length11
Median length10
Mean length6.717578562
Min length3

Characters and Unicode

Total characters1349931
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5089 ?
Unique (%)2.5%

Sample

1st row-59.6561
2nd row121.034
3rd row-63.95
4th row-67.88
5th row130.107
ValueCountFrequency (%)
177.083 872
 
0.4%
93.717 815
 
0.4%
88.08 737
 
0.4%
68.8991 618
 
0.3%
64.0 564
 
0.3%
158.417 546
 
0.3%
179.756 541
 
0.3%
162.875 490
 
0.2%
165.83 469
 
0.2%
84.9317 454
 
0.2%
Other values (16304) 194849
97.0%

Most occurring characters

ValueCountFrequency (%)
. 200955
14.9%
1 164710
12.2%
7 129040
9.6%
- 125692
9.3%
8 119399
8.8%
3 100095
7.4%
6 99897
7.4%
2 97895
7.3%
5 93363
6.9%
4 79165
 
5.9%
Other values (2) 139720
10.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1023284
75.8%
Other Punctuation 200955
 
14.9%
Dash Punctuation 125692
 
9.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 164710
16.1%
7 129040
12.6%
8 119399
11.7%
3 100095
9.8%
6 99897
9.8%
2 97895
9.6%
5 93363
9.1%
4 79165
7.7%
9 76344
7.5%
0 63376
 
6.2%
Other Punctuation
ValueCountFrequency (%)
. 200955
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 125692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1349931
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 200955
14.9%
1 164710
12.2%
7 129040
9.6%
- 125692
9.3%
8 119399
8.8%
3 100095
7.4%
6 99897
7.4%
2 97895
7.3%
5 93363
6.9%
4 79165
 
5.9%
Other values (2) 139720
10.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1349931
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 200955
14.9%
1 164710
12.2%
7 129040
9.6%
- 125692
9.3%
8 119399
8.8%
3 100095
7.4%
6 99897
7.4%
2 97895
7.3%
5 93363
6.9%
4 79165
 
5.9%
Other values (2) 139720
10.4%
Distinct220
Distinct (%)4.3%
Missing450059
Missing (%)98.9%
Memory size3.5 MiB

Length

Max length8
Median length7
Mean length5.768872501
Min length4

Characters and Unicode

Total characters29727
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique36 ?
Unique (%)0.7%

Sample

1st row100.0
2nd row457.0
3rd row739.0
4th row100.0
5th row8438.0
ValueCountFrequency (%)
100.0 1109
21.5%
10000.0 832
 
16.1%
3704.0 209
 
4.1%
500.0 188
 
3.6%
5000.0 122
 
2.4%
278076.0 107
 
2.1%
441.0 89
 
1.7%
330.0 83
 
1.6%
50.0 78
 
1.5%
3512.0 73
 
1.4%
Other values (210) 2263
43.9%

Most occurring characters

ValueCountFrequency (%)
0 13130
44.2%
. 5153
 
17.3%
1 3226
 
10.9%
2 1526
 
5.1%
4 1387
 
4.7%
3 1195
 
4.0%
5 1160
 
3.9%
6 830
 
2.8%
7 776
 
2.6%
8 720
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 24574
82.7%
Other Punctuation 5153
 
17.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 13130
53.4%
1 3226
 
13.1%
2 1526
 
6.2%
4 1387
 
5.6%
3 1195
 
4.9%
5 1160
 
4.7%
6 830
 
3.4%
7 776
 
3.2%
8 720
 
2.9%
9 624
 
2.5%
Other Punctuation
ValueCountFrequency (%)
. 5153
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 29727
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 13130
44.2%
. 5153
 
17.3%
1 3226
 
10.9%
2 1526
 
5.1%
4 1387
 
4.7%
3 1195
 
4.0%
5 1160
 
3.9%
6 830
 
2.8%
7 776
 
2.6%
8 720
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29727
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 13130
44.2%
. 5153
 
17.3%
1 3226
 
10.9%
2 1526
 
5.1%
4 1387
 
4.7%
3 1195
 
4.0%
5 1160
 
3.9%
6 830
 
2.8%
7 776
 
2.6%
8 720
 
2.4%

pointRadiusSpatialFit
Text

Missing 

Distinct7
Distinct (%)100.0%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters49
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st row2341036
2nd row2384468
3rd row2353475
4th row2373066
5th row2414948
ValueCountFrequency (%)
2341036 1
14.3%
2384468 1
14.3%
2353475 1
14.3%
2373066 1
14.3%
2414948 1
14.3%
2393782 1
14.3%
2335095 1
14.3%

Most occurring characters

ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 49
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Common 49
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%
Distinct3
Distinct (%)< 0.1%
Missing308939
Missing (%)67.9%
Memory size3.5 MiB

Length

Max length23
Median length23
Mean length22.92758746
Min length7

Characters and Unicode

Total characters3353687
Distinct characters21
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDegrees Minutes Seconds
2nd rowDegrees Minutes Seconds
3rd rowDegrees Minutes Seconds
4th rowDegrees Minutes Seconds
5th rowDegrees Minutes Seconds
ValueCountFrequency (%)
degrees 146265
33.4%
minutes 144957
33.1%
seconds 144957
33.1%
decimal 1308
 
0.3%
unknown 8
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 730017
21.8%
s 436179
13.0%
(space) 291222
 
8.7%
n 289938
 
8.6%
D 146265
 
4.4%
c 146265
 
4.4%
g 146265
 
4.4%
r 146265
 
4.4%
d 146265
 
4.4%
i 146265
 
4.4%
Other values (11) 728741
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2626278
78.3%
Uppercase Letter 436187
 
13.0%
Space Separator 291222
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 730017
27.8%
s 436179
16.6%
n 289938
 
11.0%
c 146265
 
5.6%
g 146265
 
5.6%
r 146265
 
5.6%
d 146265
 
5.6%
i 146265
 
5.6%
o 144965
 
5.5%
t 144957
 
5.5%
Other values (6) 148897
 
5.7%
Uppercase Letter
ValueCountFrequency (%)
D 146265
33.5%
S 144957
33.2%
M 144957
33.2%
U 8
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 291222
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3062465
91.3%
Common 291222
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 730017
23.8%
s 436179
14.2%
n 289938
 
9.5%
D 146265
 
4.8%
c 146265
 
4.8%
g 146265
 
4.8%
r 146265
 
4.8%
d 146265
 
4.8%
i 146265
 
4.8%
o 144965
 
4.7%
Other values (10) 583776
19.1%
Common
ValueCountFrequency (%)
(space) 291222
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3353687
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 730017
21.8%
s 436179
13.0%
(space) 291222
 
8.7%
n 289938
 
8.6%
D 146265
 
4.4%
c 146265
 
4.4%
g 146265
 
4.4%
r 146265
 
4.4%
d 146265
 
4.4%
i 146265
 
4.4%
Other values (11) 728741
21.7%

georeferencedBy
Text

Missing 

Distinct7
Distinct (%)100.0%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length40
Median length36
Mean length35.85714286
Min length33

Characters and Unicode

Total characters251
Distinct characters47
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st rowNoturus nocturnus Jordan & Gilbert, 1886
2nd rowThalassoma lunare (Linnaeus, 1758)
3rd rowBrycon falcatus Müller & Troschel, 1844
4th rowPseudotropheus elongatus Fryer, 1956
5th rowHalieutaea brevicauda Ogilby, 1910
ValueCountFrequency (%)
2
 
6.2%
noturus 1
 
3.1%
elongatus 1
 
3.1%
pallas 1
 
3.1%
cirrhosus 1
 
3.1%
blepsias 1
 
3.1%
1840 1
 
3.1%
valenciennes 1
 
3.1%
globiceps 1
 
3.1%
scarus 1
 
3.1%
Other values (21) 21
65.6%

Most occurring characters

ValueCountFrequency (%)
(space) 25
 
10.0%
a 19
 
7.6%
s 18
 
7.2%
e 17
 
6.8%
r 15
 
6.0%
l 15
 
6.0%
u 14
 
5.6%
n 11
 
4.4%
o 11
 
4.4%
c 9
 
3.6%
Other values (37) 97
38.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 169
67.3%
Decimal Number 28
 
11.2%
Space Separator 25
 
10.0%
Uppercase Letter 16
 
6.4%
Other Punctuation 9
 
3.6%
Close Punctuation 2
 
0.8%
Open Punctuation 2
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 19
11.2%
s 18
10.7%
e 17
10.1%
r 15
8.9%
l 15
8.9%
u 14
8.3%
n 11
 
6.5%
o 11
 
6.5%
c 9
 
5.3%
i 9
 
5.3%
Other values (11) 31
18.3%
Uppercase Letter
ValueCountFrequency (%)
B 2
12.5%
P 2
12.5%
T 2
12.5%
O 1
 
6.2%
F 1
 
6.2%
S 1
 
6.2%
H 1
 
6.2%
N 1
 
6.2%
M 1
 
6.2%
L 1
 
6.2%
Other values (3) 3
18.8%
Decimal Number
ValueCountFrequency (%)
1 9
32.1%
8 6
21.4%
4 4
14.3%
9 2
 
7.1%
5 2
 
7.1%
6 2
 
7.1%
0 2
 
7.1%
7 1
 
3.6%
Other Punctuation
ValueCountFrequency (%)
, 7
77.8%
& 2
 
22.2%
Space Separator
ValueCountFrequency (%)
(space) 25
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 185
73.7%
Common 66
 
26.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 19
10.3%
s 18
 
9.7%
e 17
 
9.2%
r 15
 
8.1%
l 15
 
8.1%
u 14
 
7.6%
n 11
 
5.9%
o 11
 
5.9%
c 9
 
4.9%
i 9
 
4.9%
Other values (24) 47
25.4%
Common
ValueCountFrequency (%)
(space) 25
37.9%
1 9
 
13.6%
, 7
 
10.6%
8 6
 
9.1%
4 4
 
6.1%
9 2
 
3.0%
) 2
 
3.0%
5 2
 
3.0%
( 2
 
3.0%
6 2
 
3.0%
Other values (3) 5
 
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 250
99.6%
None 1
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 25
 
10.0%
a 19
 
7.6%
s 18
 
7.2%
e 17
 
6.8%
r 15
 
6.0%
l 15
 
6.0%
u 14
 
5.6%
n 11
 
4.4%
o 11
 
4.4%
c 9
 
3.6%
Other values (36) 96
38.4%
None
ValueCountFrequency (%)
ü 1
100.0%
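The single character listed under the "None" block is the "ü" from the sample value "Brycon falcatus Müller & Troschel, 1844". As a small sketch, the standard `unicodedata` module can be used to list non-ASCII characters together with their code points and names. The helper `non_ascii_chars` is a hypothetical name introduced here for illustration.

```python
import unicodedata

def non_ascii_chars(text):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 0x7F  # ASCII covers U+0000 to U+007F
    ]

# \u00fc is the precomposed form of the letter shown in the sample row
sample = "Brycon falcatus M\u00fcller & Troschel, 1844"
found = non_ascii_chars(sample)
print(found)
```

In the Unicode Standard, U+00FC falls in the Latin-1 Supplement block (U+0080 to U+00FF), so the "None" label most likely reflects a limitation of the block lookup rather than a property of the character.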

georeferenceProtocol
Text

Missing 

Distinct16
Distinct (%)0.1%
Missing437832
Missing (%)96.2%
Memory size3.5 MiB

Length

Max length125
Median length96
Mean length19.25863061
Min length3

Characters and Unicode

Total characters334715
Distinct characters61
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowGPS
2nd rowOn-line Gazetteer
3rd rowDifferential GPS
4th rowGuide to Best Practices for Georeferencing. (Chapman and Wieczorek, eds. 2006). Google Earth Pro
5th rowChart
ValueCountFrequency (%)
chart 6339
 
11.9%
gps 6318
 
11.9%
google 3627
 
6.8%
earth 3256
 
6.1%
georeferencing 2448
 
4.6%
and 2426
 
4.6%
pro 2399
 
4.5%
2006 2399
 
4.5%
wieczorek 2399
 
4.5%
eds 2399
 
4.5%
Other values (37) 19229
36.1%

Most occurring characters

ValueCountFrequency (%)
(space) 35859
 
10.7%
e 34001
 
10.2%
r 26699
 
8.0%
a 21993
 
6.6%
t 20226
 
6.0%
o 19852
 
5.9%
G 16342
 
4.9%
n 13437
 
4.0%
h 12548
 
3.7%
i 12159
 
3.6%
Other values (51) 121599
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 217719
65.0%
Uppercase Letter 53904
 
16.1%
Space Separator 35859
 
10.7%
Other Punctuation 10427
 
3.1%
Decimal Number 10339
 
3.1%
Open Punctuation 2475
 
0.7%
Close Punctuation 2475
 
0.7%
Dash Punctuation 1235
 
0.4%
Math Symbol 282
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 34001
15.6%
r 26699
12.3%
a 21993
10.1%
t 20226
9.3%
o 19852
9.1%
n 13437
 
6.2%
h 12548
 
5.8%
i 12159
 
5.6%
c 10049
 
4.6%
s 7688
 
3.5%
Other values (15) 39067
17.9%
Uppercase Letter
ValueCountFrequency (%)
G 16342
30.3%
P 11268
20.9%
C 8787
16.3%
S 6378
 
11.8%
E 3538
 
6.6%
W 2399
 
4.5%
B 2399
 
4.5%
O 1228
 
2.3%
M 344
 
0.6%
R 331
 
0.6%
Other values (9) 890
 
1.7%
Decimal Number
ValueCountFrequency (%)
0 5063
49.0%
2 2573
24.9%
6 2399
23.2%
1 179
 
1.7%
3 49
 
0.5%
5 49
 
0.5%
4 27
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 7420
71.2%
, 2502
 
24.0%
; 206
 
2.0%
/ 201
 
1.9%
: 98
 
0.9%
Space Separator
ValueCountFrequency (%)
(space) 35859
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2475
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2475
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1235
100.0%
Math Symbol
ValueCountFrequency (%)
+ 282
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 271623
81.2%
Common 63092
 
18.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 34001
12.5%
r 26699
 
9.8%
a 21993
 
8.1%
t 20226
 
7.4%
o 19852
 
7.3%
G 16342
 
6.0%
n 13437
 
4.9%
h 12548
 
4.6%
i 12159
 
4.5%
P 11268
 
4.1%
Other values (34) 83098
30.6%
Common
ValueCountFrequency (%)
(space) 35859
56.8%
. 7420
 
11.8%
0 5063
 
8.0%
2 2573
 
4.1%
, 2502
 
4.0%
( 2475
 
3.9%
) 2475
 
3.9%
6 2399
 
3.8%
- 1235
 
2.0%
+ 282
 
0.4%
Other values (7) 809
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 334715
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 35859
 
10.7%
e 34001
 
10.2%
r 26699
 
8.0%
a 21993
 
6.6%
t 20226
 
6.0%
o 19852
 
5.9%
G 16342
 
4.9%
n 13437
 
4.0%
h 12548
 
3.7%
i 12159
 
3.6%
Other values (51) 121599
36.3%

georeferenceRemarks
Text

Missing 

Distinct135
Distinct (%)0.6%
Missing432197
Missing (%)94.9%
Memory size3.5 MiB

Length

Max length158
Median length2
Mean length7.226026504
Min length1

Characters and Unicode

Total characters166307
Distinct characters60
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique64 ?
Unique (%)0.3%

Sample

1st rowStart; End
2nd rowca
3rd rowCA
4th rowCA
5th rowCA
ValueCountFrequency (%)
ca 18410
46.3%
start 2530
 
6.4%
end 2436
 
6.1%
bank 1768
 
4.4%
flower 1768
 
4.4%
garden 1768
 
4.4%
for 977
 
2.5%
west 940
 
2.4%
east 828
 
2.1%
coordinates 580
 
1.5%
Other values (263) 7789
19.6%

Most occurring characters

ValueCountFrequency (%)
C 17102
 
10.3%
(space) 16779
 
10.1%
A 16547
 
9.9%
a 12099
 
7.3%
t 11571
 
7.0%
n 10340
 
6.2%
e 9905
 
6.0%
r 8710
 
5.2%
o 7884
 
4.7%
d 6230
 
3.7%
Other values (50) 49140
29.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 93445
56.2%
Uppercase Letter 47863
28.8%
Space Separator 16779
 
10.1%
Other Punctuation 6140
 
3.7%
Decimal Number 2011
 
1.2%
Dash Punctuation 65
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12099
12.9%
t 11571
12.4%
n 10340
11.1%
e 9905
10.6%
r 8710
9.3%
o 7884
8.4%
d 6230
6.7%
l 5770
6.2%
i 4482
 
4.8%
c 4310
 
4.6%
Other values (13) 12144
13.0%
Uppercase Letter
ValueCountFrequency (%)
C 17102
35.7%
A 16547
34.6%
E 3265
 
6.8%
S 2817
 
5.9%
G 2363
 
4.9%
B 1987
 
4.2%
F 1797
 
3.8%
W 1100
 
2.3%
O 216
 
0.5%
T 165
 
0.3%
Other values (7) 504
 
1.1%
Decimal Number
ValueCountFrequency (%)
1 400
19.9%
3 335
16.7%
6 216
10.7%
9 198
9.8%
8 196
9.7%
4 177
8.8%
2 164
8.2%
5 128
 
6.4%
0 123
 
6.1%
7 74
 
3.7%
Other Punctuation
ValueCountFrequency (%)
; 4208
68.5%
. 1334
 
21.7%
, 592
 
9.6%
" 4
 
0.1%
/ 1
 
< 0.1%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 16779
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 65
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 141308
85.0%
Common 24999
 
15.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 17102
12.1%
A 16547
11.7%
a 12099
 
8.6%
t 11571
 
8.2%
n 10340
 
7.3%
e 9905
 
7.0%
r 8710
 
6.2%
o 7884
 
5.6%
d 6230
 
4.4%
l 5770
 
4.1%
Other values (30) 35150
24.9%
Common
ValueCountFrequency (%)
(space) 16779
67.1%
; 4208
 
16.8%
. 1334
 
5.3%
, 592
 
2.4%
1 400
 
1.6%
3 335
 
1.3%
6 216
 
0.9%
9 198
 
0.8%
8 196
 
0.8%
4 177
 
0.7%
Other values (10) 564
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 166307
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 17102
 
10.3%
(space) 16779
 
10.1%
A 16547
 
9.9%
a 12099
 
7.3%
t 11571
 
7.0%
n 10340
 
6.2%
e 9905
 
6.0%
r 8710
 
5.2%
o 7884
 
4.7%
d 6230
 
3.7%
Other values (50) 49140
29.5%
Distinct7
Distinct (%)100.0%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length141
Median length134
Mean length126.7142857
Min length114

Characters and Unicode

Total characters887
Distinct characters31
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Ostariophysi, Siluriformes, Ictaluridae
2nd rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Labroidei, Labridae
3rd rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Ostariophysi, Characiformes, Characidae
4th rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Labroidei, Cichlidae
5th rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Paracanthopterygii, Lophiiformes, Ogcocephalioidei, Ogcocephalidae
ValueCountFrequency (%)
animalia 7
10.1%
vertebrata 7
10.1%
osteichthyes 7
10.1%
actinopterygii 7
10.1%
neopterygii 7
10.1%
chordata 7
10.1%
acanthopterygii 4
 
5.8%
perciformes 3
 
4.3%
labroidei 3
 
4.3%
ostariophysi 2
 
2.9%
Other values (15) 15
21.7%

Most occurring characters

ValueCountFrequency (%)
i 101
 
11.4%
e 82
 
9.2%
a 73
 
8.2%
t 73
 
8.2%
r 66
 
7.4%
, 62
 
7.0%
(space) 62
 
7.0%
o 45
 
5.1%
h 34
 
3.8%
c 33
 
3.7%
Other values (21) 256
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 694
78.2%
Uppercase Letter 69
 
7.8%
Other Punctuation 62
 
7.0%
Space Separator 62
 
7.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 101
14.6%
e 82
11.8%
a 73
10.5%
t 73
10.5%
r 66
9.5%
o 45
 
6.5%
h 34
 
4.9%
c 33
 
4.8%
y 28
 
4.0%
p 26
 
3.7%
Other values (9) 133
19.2%
Uppercase Letter
ValueCountFrequency (%)
A 18
26.1%
O 11
15.9%
C 11
15.9%
V 7
 
10.1%
N 7
 
10.1%
L 5
 
7.2%
S 4
 
5.8%
P 4
 
5.8%
I 1
 
1.4%
H 1
 
1.4%
Other Punctuation
ValueCountFrequency (%)
, 62
100.0%
Space Separator
ValueCountFrequency (%)
(space) 62
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 763
86.0%
Common 124
 
14.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 101
13.2%
e 82
10.7%
a 73
 
9.6%
t 73
 
9.6%
r 66
 
8.7%
o 45
 
5.9%
h 34
 
4.5%
c 33
 
4.3%
y 28
 
3.7%
p 26
 
3.4%
Other values (19) 202
26.5%
Common
ValueCountFrequency (%)
, 62
50.0%
(space) 62
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 887
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 101
 
11.4%
e 82
 
9.2%
a 73
 
8.2%
t 73
 
8.2%
r 66
 
7.4%
, 62
 
7.0%
(space) 62
 
7.0%
o 45
 
5.1%
h 34
 
3.8%
c 33
 
3.7%
Other values (21) 256
28.9%

earliestEraOrLowestErathem
Text

Constant  Missing 

Distinct1
Distinct (%)14.3%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters56
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAnimalia
2nd rowAnimalia
3rd rowAnimalia
4th rowAnimalia
5th rowAnimalia
ValueCountFrequency (%)
animalia 7
100.0%

Most occurring characters

ValueCountFrequency (%)
i 14
25.0%
a 14
25.0%
A 7
12.5%
n 7
12.5%
m 7
12.5%
l 7
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 49
87.5%
Uppercase Letter 7
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 14
28.6%
a 14
28.6%
n 7
14.3%
m 7
14.3%
l 7
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 56
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 14
25.0%
a 14
25.0%
A 7
12.5%
n 7
12.5%
m 7
12.5%
l 7
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 14
25.0%
a 14
25.0%
A 7
12.5%
n 7
12.5%
m 7
12.5%
l 7
12.5%

latestEraOrHighestErathem
Text

Constant  Missing 

Distinct1
Distinct (%)14.3%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters56
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowChordata
2nd rowChordata
3rd rowChordata
4th rowChordata
5th rowChordata
ValueCountFrequency (%)
chordata 7
100.0%

Most occurring characters

ValueCountFrequency (%)
a 14
25.0%
C 7
12.5%
h 7
12.5%
o 7
12.5%
r 7
12.5%
d 7
12.5%
t 7
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 49
87.5%
Uppercase Letter 7
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 14
28.6%
h 7
14.3%
o 7
14.3%
r 7
14.3%
d 7
14.3%
t 7
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 56
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 14
25.0%
C 7
12.5%
h 7
12.5%
o 7
12.5%
r 7
12.5%
d 7
12.5%
t 7
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 14
25.0%
C 7
12.5%
h 7
12.5%
o 7
12.5%
r 7
12.5%
d 7
12.5%
t 7
12.5%
Distinct5
Distinct (%)71.4%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length15
Median length13
Mean length12.14285714
Min length11

Characters and Unicode

Total characters85
Distinct characters18
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)57.1%

Sample

1st rowSiluriformes
2nd rowPerciformes
3rd rowCharaciformes
4th rowPerciformes
5th rowLophiiformes
ValueCountFrequency (%)
perciformes 3
42.9%
siluriformes 1
 
14.3%
characiformes 1
 
14.3%
lophiiformes 1
 
14.3%
scorpaeniformes 1
 
14.3%

Most occurring characters

ValueCountFrequency (%)
r 13
15.3%
e 11
12.9%
i 9
10.6%
o 9
10.6%
f 7
8.2%
m 7
8.2%
s 7
8.2%
c 5
 
5.9%
P 3
 
3.5%
a 3
 
3.5%
Other values (8) 11
12.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 78
91.8%
Uppercase Letter 7
 
8.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 13
16.7%
e 11
14.1%
i 9
11.5%
o 9
11.5%
f 7
9.0%
m 7
9.0%
s 7
9.0%
c 5
 
6.4%
a 3
 
3.8%
h 2
 
2.6%
Other values (4) 5
 
6.4%
Uppercase Letter
ValueCountFrequency (%)
P 3
42.9%
S 2
28.6%
C 1
 
14.3%
L 1
 
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 85
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 13
15.3%
e 11
12.9%
i 9
10.6%
o 9
10.6%
f 7
8.2%
m 7
8.2%
s 7
8.2%
c 5
 
5.9%
P 3
 
3.5%
a 3
 
3.5%
Other values (8) 11
12.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 85
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 13
15.3%
e 11
12.9%
i 9
10.6%
o 9
10.6%
f 7
8.2%
m 7
8.2%
s 7
8.2%
c 5
 
5.9%
P 3
 
3.5%
a 3
 
3.5%
Other values (8) 11
12.9%
Distinct7
Distinct (%)100.0%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length15
Median length11
Mean length10.71428571
Min length8

Characters and Unicode

Total characters75
Distinct characters24
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st rowIctaluridae
2nd rowLabridae
3rd rowBryconidae
4th rowCichlidae
5th rowOgcocephalidae
ValueCountFrequency (%)
ictaluridae 1
14.3%
labridae 1
14.3%
bryconidae 1
14.3%
cichlidae 1
14.3%
ogcocephalidae 1
14.3%
scaridae 1
14.3%
hemitripteridae 1
14.3%

Most occurring characters

ValueCountFrequency (%)
a 11
14.7%
i 10
13.3%
e 10
13.3%
d 7
9.3%
r 6
 
8.0%
c 6
 
8.0%
t 3
 
4.0%
l 3
 
4.0%
h 2
 
2.7%
p 2
 
2.7%
Other values (14) 15
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 68
90.7%
Uppercase Letter 7
 
9.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 11
16.2%
i 10
14.7%
e 10
14.7%
d 7
10.3%
r 6
8.8%
c 6
8.8%
t 3
 
4.4%
l 3
 
4.4%
h 2
 
2.9%
p 2
 
2.9%
Other values (7) 8
11.8%
Uppercase Letter
ValueCountFrequency (%)
I 1
14.3%
H 1
14.3%
S 1
14.3%
O 1
14.3%
B 1
14.3%
C 1
14.3%
L 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 75
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 11
14.7%
i 10
13.3%
e 10
13.3%
d 7
9.3%
r 6
 
8.0%
c 6
 
8.0%
t 3
 
4.0%
l 3
 
4.0%
h 2
 
2.7%
p 2
 
2.7%
Other values (14) 15
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 75
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 11
14.7%
i 10
13.3%
e 10
13.3%
d 7
9.3%
r 6
 
8.0%
c 6
 
8.0%
t 3
 
4.0%
l 3
 
4.0%
h 2
 
2.7%
p 2
 
2.7%
Other values (14) 15
20.0%
Distinct7
Distinct (%)100.0%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length14
Median length10
Mean length8.714285714
Min length6

Characters and Unicode

Total characters61
Distinct characters22
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st rowNoturus
2nd rowThalassoma
3rd rowBrycon
4th rowPseudotropheus
5th rowHalieutaea
ValueCountFrequency (%)
noturus 1
14.3%
thalassoma 1
14.3%
brycon 1
14.3%
pseudotropheus 1
14.3%
halieutaea 1
14.3%
scarus 1
14.3%
blepsias 1
14.3%

Most occurring characters

ValueCountFrequency (%)
s 8
13.1%
a 8
13.1%
u 6
 
9.8%
e 5
 
8.2%
o 5
 
8.2%
r 4
 
6.6%
t 3
 
4.9%
l 3
 
4.9%
B 2
 
3.3%
i 2
 
3.3%
Other values (12) 15
24.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 54
88.5%
Uppercase Letter 7
 
11.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 8
14.8%
a 8
14.8%
u 6
11.1%
e 5
9.3%
o 5
9.3%
r 4
7.4%
t 3
 
5.6%
l 3
 
5.6%
i 2
 
3.7%
h 2
 
3.7%
Other values (6) 8
14.8%
Uppercase Letter
ValueCountFrequency (%)
B 2
28.6%
H 1
14.3%
N 1
14.3%
P 1
14.3%
T 1
14.3%
S 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 61
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 8
13.1%
a 8
13.1%
u 6
 
9.8%
e 5
 
8.2%
o 5
 
8.2%
r 4
 
6.6%
t 3
 
4.9%
l 3
 
4.9%
B 2
 
3.3%
i 2
 
3.3%
Other values (12) 15
24.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 61
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 8
13.1%
a 8
13.1%
u 6
 
9.8%
e 5
 
8.2%
o 5
 
8.2%
r 4
 
6.6%
t 3
 
4.9%
l 3
 
4.9%
B 2
 
3.3%
i 2
 
3.3%
Other values (12) 15
24.6%
Distinct7
Distinct (%)100.0%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length14
Median length10
Mean length8.714285714
Min length6

Characters and Unicode

Total characters61
Distinct characters22
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st rowNoturus
2nd rowThalassoma
3rd rowBrycon
4th rowPseudotropheus
5th rowHalieutaea
ValueCountFrequency (%)
noturus 1
14.3%
thalassoma 1
14.3%
brycon 1
14.3%
pseudotropheus 1
14.3%
halieutaea 1
14.3%
scarus 1
14.3%
blepsias 1
14.3%

Most occurring characters

ValueCountFrequency (%)
s 8
13.1%
a 8
13.1%
u 6
 
9.8%
e 5
 
8.2%
o 5
 
8.2%
r 4
 
6.6%
t 3
 
4.9%
l 3
 
4.9%
B 2
 
3.3%
i 2
 
3.3%
Other values (12) 15
24.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 54
88.5%
Uppercase Letter 7
 
11.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 8
14.8%
a 8
14.8%
u 6
11.1%
e 5
9.3%
o 5
9.3%
r 4
7.4%
t 3
 
5.6%
l 3
 
5.6%
i 2
 
3.7%
h 2
 
3.7%
Other values (6) 8
14.8%
Uppercase Letter
ValueCountFrequency (%)
B 2
28.6%
H 1
14.3%
N 1
14.3%
P 1
14.3%
T 1
14.3%
S 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 61
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 8
13.1%
a 8
13.1%
u 6
 
9.8%
e 5
 
8.2%
o 5
 
8.2%
r 4
 
6.6%
t 3
 
4.9%
l 3
 
4.9%
B 2
 
3.3%
i 2
 
3.3%
Other values (12) 15
24.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 61
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 8
13.1%
a 8
13.1%
u 6
 
9.8%
e 5
 
8.2%
o 5
 
8.2%
r 4
 
6.6%
t 3
 
4.9%
l 3
 
4.9%
B 2
 
3.3%
i 2
 
3.3%
Other values (12) 15
24.6%

member
Text

Missing 

Distinct7
Distinct (%)100.0%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length10
Median length9
Mean length8.571428571
Min length6

Characters and Unicode

Total characters60
Distinct characters18
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st rownocturnus
2nd rowlunare
3rd rowfalcatus
4th rowelongatus
5th rowbrevicauda
ValueCountFrequency (%)
nocturnus 1
14.3%
lunare 1
14.3%
falcatus 1
14.3%
elongatus 1
14.3%
brevicauda 1
14.3%
globiceps 1
14.3%
cirrhosus 1
14.3%

Most occurring characters

ValueCountFrequency (%)
u 7
11.7%
a 6
10.0%
s 6
10.0%
c 5
8.3%
r 5
8.3%
n 4
 
6.7%
o 4
 
6.7%
e 4
 
6.7%
l 4
 
6.7%
t 3
 
5.0%
Other values (8) 12
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 60
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 7
11.7%
a 6
10.0%
s 6
10.0%
c 5
8.3%
r 5
8.3%
n 4
 
6.7%
o 4
 
6.7%
e 4
 
6.7%
l 4
 
6.7%
t 3
 
5.0%
Other values (8) 12
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 60
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 7
11.7%
a 6
10.0%
s 6
10.0%
c 5
8.3%
r 5
8.3%
n 4
 
6.7%
o 4
 
6.7%
e 4
 
6.7%
l 4
 
6.7%
t 3
 
5.0%
Other values (8) 12
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 60
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u 7
11.7%
a 6
10.0%
s 6
10.0%
c 5
8.3%
r 5
8.3%
n 4
 
6.7%
o 4
 
6.7%
e 4
 
6.7%
l 4
 
6.7%
t 3
 
5.0%
Other values (8) 12
20.0%

verbatimIdentification
Text

Constant  Missing 

Distinct1
Distinct (%)14.3%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB
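
The summary figures above can be reproduced with a few lines of arithmetic. The sketch below assumes that "Distinct (%)" is taken over the non-missing values (1 of 7 gives 14.3%) and that "Missing (%)" is taken over all 455212 observations; both denominators are inferred from the numbers shown, not from the tool's documentation. The helper name `summarize` and the reconstructed column are hypothetical.

```python
def summarize(column):
    """Return distinct/missing counts and percentages for a list of values."""
    present = [v for v in column if v is not None]
    n_missing = len(column) - len(present)
    n_distinct = len(set(present))
    return {
        "distinct": n_distinct,
        # Assumption: percentage of distinct values among non-missing rows
        "distinct_pct": 100 * n_distinct / len(present) if present else 0.0,
        "missing": n_missing,
        # Assumption: percentage of missing rows among all rows
        "missing_pct": 100 * n_missing / len(column) if column else 0.0,
    }

# Hypothetical reconstruction: 7 populated rows out of 455212 observations.
column = ["SPECIES"] * 7 + [None] * 455205
stats = summarize(column)
print(stats["distinct"], round(stats["distinct_pct"], 1))
print(stats["missing"], stats["missing_pct"] > 99.9)
```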

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters49
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSPECIES
2nd rowSPECIES
3rd rowSPECIES
4th rowSPECIES
5th rowSPECIES
ValueCountFrequency (%)
species 7
100.0%

Most occurring characters

ValueCountFrequency (%)
S 14
28.6%
E 14
28.6%
P 7
14.3%
C 7
14.3%
I 7
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 49
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 14
28.6%
E 14
28.6%
P 7
14.3%
C 7
14.3%
I 7
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 49
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 14
28.6%
E 14
28.6%
P 7
14.3%
C 7
14.3%
I 7
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 14
28.6%
E 14
28.6%
P 7
14.3%
C 7
14.3%
I 7
14.3%
Distinct5
Distinct (%)0.3%
Missing453516
Missing (%)99.6%
Memory size3.5 MiB

Length

Max length9
Median length3
Mean length5.780660377
Min length3

Characters and Unicode

Total characters9804
Distinct characters11
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowcf.
2nd rowuncertain
3rd rowuncertain
4th rowuncertain
5th rownear
Value      Count  Frequency (%)
cf           895  52.8%
uncertain    783  46.2%
aff           14  0.8%
near           4  0.2%

Most occurring characters

ValueCountFrequency (%)
c 1678
17.1%
n 1570
16.0%
f 923
9.4%
. 909
9.3%
a 801
8.2%
e 787
8.0%
r 787
8.0%
t 783
8.0%
i 783
8.0%
u 652
 
6.7%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter    8764  89.4%
Other Punctuation    909  9.3%
Uppercase Letter     131  1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 1678
19.1%
n 1570
17.9%
f 923
10.5%
a 801
9.1%
e 787
9.0%
r 787
9.0%
t 783
8.9%
i 783
8.9%
u 652
 
7.4%
Other Punctuation
ValueCountFrequency (%)
. 909
100.0%
Uppercase Letter
ValueCountFrequency (%)
U 131
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8895
90.7%
Common 909
 
9.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 1678
18.9%
n 1570
17.7%
f 923
10.4%
a 801
9.0%
e 787
8.8%
r 787
8.8%
t 783
8.8%
i 783
8.8%
u 652
 
7.3%
U 131
 
1.5%
Common
ValueCountFrequency (%)
. 909
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 1678
17.1%
n 1570
16.0%
f 923
9.4%
. 909
9.3%
a 801
8.2%
e 787
8.0%
r 787
8.0%
t 783
8.0%
i 783
8.0%
u 652
 
6.7%

typeStatus
Text

Missing 

Distinct9
Distinct (%)< 0.1%
Missing436448
Missing (%)95.9%
Memory size3.5 MiB

Length

Max length13
Median length8
Mean length7.670219569
Min length4

Characters and Unicode

Total characters143924
Distinct characters12
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPARATYPE
2nd rowHOLOTYPE
3rd rowPARATYPE
4th rowPARATYPE
5th rowCOTYPE
Value          Count  Frequency (%)
paratype       12437  66.3%
holotype        3339  17.8%
type            1470  7.8%
syntype          819  4.4%
cotype           296  1.6%
paralectotype    207  1.1%
lectotype        127  0.7%
neotype           59  0.3%
allotype          10  0.1%

Most occurring characters

ValueCountFrequency (%)
P 31408
21.8%
A 25298
17.6%
Y 19583
13.6%
E 19157
13.3%
T 19098
13.3%
R 12644
8.8%
O 7377
 
5.1%
L 3693
 
2.6%
H 3339
 
2.3%
N 878
 
0.6%
Other values (2) 1449
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 143924
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 31408
21.8%
A 25298
17.6%
Y 19583
13.6%
E 19157
13.3%
T 19098
13.3%
R 12644
8.8%
O 7377
 
5.1%
L 3693
 
2.6%
H 3339
 
2.3%
N 878
 
0.6%
Other values (2) 1449
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 143924
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 31408
21.8%
A 25298
17.6%
Y 19583
13.6%
E 19157
13.3%
T 19098
13.3%
R 12644
8.8%
O 7377
 
5.1%
L 3693
 
2.6%
H 3339
 
2.3%
N 878
 
0.6%
Other values (2) 1449
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 143924
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 31408
21.8%
A 25298
17.6%
Y 19583
13.6%
E 19157
13.3%
T 19098
13.3%
R 12644
8.8%
O 7377
 
5.1%
L 3693
 
2.6%
H 3339
 
2.3%
N 878
 
0.6%
Other values (2) 1449
 
1.0%

identifiedBy
Text

Missing 

Distinct572
Distinct (%)1.7%
Missing421073
Missing (%)92.5%
Memory size3.5 MiB

Length

Max length147
Median length137
Mean length21.13904918
Min length5

Characters and Unicode

Total characters721666
Distinct characters69
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks3 ?

Unique

Unique143 ?
Unique (%)0.4%

Sample

1st rowPezold, Frank; Larson, Helen K.
2nd rowWilliams, Jeffrey T.
3rd rowWilliams, Jeffrey T.
4th rowEschmeyer, William N.
5th rowKarnella, Susan J.
ValueCountFrequency (%)
williams 6495
 
5.8%
jeffrey 6367
 
5.7%
t 6366
 
5.7%
e 4376
 
3.9%
david 4213
 
3.8%
g 4044
 
3.6%
smith 3785
 
3.4%
c 2656
 
2.4%
pitassy 2526
 
2.3%
diane 2526
 
2.3%
Other values (967) 68435
61.2%
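
The word table above appears to count lowercase tokens with punctuation stripped (for example "Williams, Jeffrey T." contributes "williams", "jeffrey" and "t"). A minimal sketch of that kind of tally follows, applied only to the five sample rows shown for this column. The tokenisation rule (runs of word characters after lowercasing) is an assumption and may differ from what the report uses; the helper name `word_counts` is hypothetical.

```python
import re
from collections import Counter

# The five sample rows shown for this column.
rows = [
    "Pezold, Frank; Larson, Helen K.",
    "Williams, Jeffrey T.",
    "Williams, Jeffrey T.",
    "Eschmeyer, William N.",
    "Karnella, Susan J.",
]

def word_counts(values):
    """Lowercase each value and count runs of word characters."""
    # Assumed tokenisation: \w+ drops commas, semicolons and full stops.
    return Counter(w for v in values for w in re.findall(r"\w+", v.lower()))

counts = word_counts(rows)
total = sum(counts.values())
for word, n in counts.most_common(3):
    print(f"{word} {n} {100 * n / total:.1f}%")
```

Because the sketch sees only five rows, its counts are far smaller than those in the table, which covers every populated row.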

Most occurring characters

ValueCountFrequency (%)
77650
 
10.8%
a 55232
 
7.7%
i 54043
 
7.5%
e 50214
 
7.0%
, 37806
 
5.2%
r 34027
 
4.7%
l 32102
 
4.4%
n 30823
 
4.3%
. 26916
 
3.7%
t 26854
 
3.7%
Other values (59) 295999
41.0%

Most occurring categories

Value                 Count  Frequency (%)
Lowercase Letter     443204  61.4%
Uppercase Letter     127586  17.7%
Space Separator       77650  10.8%
Other Punctuation     66171  9.2%
Dash Punctuation       2357  0.3%
Close Punctuation      2272  0.3%
Open Punctuation       2272  0.3%
Final Punctuation        77  < 0.1%
Initial Punctuation      77  < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 55232
12.5%
i 54043
12.2%
e 50214
11.3%
r 34027
 
7.7%
l 32102
 
7.2%
n 30823
 
7.0%
t 26854
 
6.1%
s 22669
 
5.1%
o 22269
 
5.0%
m 19827
 
4.5%
Other values (21) 95144
21.5%
Uppercase Letter
ValueCountFrequency (%)
T 11485
 
9.0%
D 10253
 
8.0%
J 9551
 
7.5%
S 9476
 
7.4%
W 9319
 
7.3%
C 8939
 
7.0%
E 7478
 
5.9%
A 6920
 
5.4%
H 6036
 
4.7%
G 5572
 
4.4%
Other values (16) 42557
33.4%
Other Punctuation
ValueCountFrequency (%)
, 37806
57.1%
. 26916
40.7%
; 1042
 
1.6%
/ 387
 
0.6%
' 18
 
< 0.1%
& 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
77650
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2357
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2272
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2272
100.0%
Final Punctuation
ValueCountFrequency (%)
77
100.0%
Initial Punctuation
ValueCountFrequency (%)
77
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 570790
79.1%
Common 150876
 
20.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 55232
 
9.7%
i 54043
 
9.5%
e 50214
 
8.8%
r 34027
 
6.0%
l 32102
 
5.6%
n 30823
 
5.4%
t 26854
 
4.7%
s 22669
 
4.0%
o 22269
 
3.9%
m 19827
 
3.5%
Other values (47) 222730
39.0%
Common
ValueCountFrequency (%)
77650
51.5%
, 37806
25.1%
. 26916
 
17.8%
- 2357
 
1.6%
) 2272
 
1.5%
( 2272
 
1.5%
; 1042
 
0.7%
/ 387
 
0.3%
77
 
0.1%
77
 
0.1%
Other values (2) 20
 
< 0.1%

Most occurring blocks

Value         Count  Frequency (%)
ASCII        721467  > 99.9%
Punctuation     154  < 0.1%
None             45  < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
77650
 
10.8%
a 55232
 
7.7%
i 54043
 
7.5%
e 50214
 
7.0%
, 37806
 
5.2%
r 34027
 
4.7%
l 32102
 
4.4%
n 30823
 
4.3%
. 26916
 
3.7%
t 26854
 
3.7%
Other values (51) 295800
41.0%
Punctuation
ValueCountFrequency (%)
77
50.0%
77
50.0%
None
ValueCountFrequency (%)
ñ 31
68.9%
á 6
 
13.3%
Ö 2
 
4.4%
ü 2
 
4.4%
í 2
 
4.4%
ê 2
 
4.4%

identifiedByID
Text

Constant  Missing 

Distinct1
Distinct (%)14.3%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters56
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowACCEPTED
2nd rowACCEPTED
3rd rowACCEPTED
4th rowACCEPTED
5th rowACCEPTED
ValueCountFrequency (%)
accepted 7
100.0%

Most occurring characters

ValueCountFrequency (%)
C 14
25.0%
E 14
25.0%
A 7
12.5%
P 7
12.5%
T 7
12.5%
D 7
12.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 56
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 14
25.0%
E 14
25.0%
A 7
12.5%
P 7
12.5%
T 7
12.5%
D 7
12.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 56
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 14
25.0%
E 14
25.0%
A 7
12.5%
P 7
12.5%
T 7
12.5%
D 7
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 14
25.0%
E 14
25.0%
A 7
12.5%
P 7
12.5%
T 7
12.5%
D 7
12.5%

identificationVerificationStatus
Text

Constant  Missing 

Distinct1
Distinct (%)14.3%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length36
Median length36
Mean length36
Min length36

Characters and Unicode

Total characters252
Distinct characters16
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row821cc27a-e3bb-4bc5-ac34-89ada245069d
3rd row821cc27a-e3bb-4bc5-ac34-89ada245069d
4th row821cc27a-e3bb-4bc5-ac34-89ada245069d
5th row821cc27a-e3bb-4bc5-ac34-89ada245069d
ValueCountFrequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d 7
100.0%

Most occurring characters

ValueCountFrequency (%)
c 28
11.1%
a 28
11.1%
- 28
11.1%
2 21
8.3%
b 21
8.3%
4 21
8.3%
8 14
 
5.6%
3 14
 
5.6%
5 14
 
5.6%
9 14
 
5.6%
Other values (6) 49
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 126
50.0%
Lowercase Letter 98
38.9%
Dash Punctuation 28
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 21
16.7%
4 21
16.7%
8 14
11.1%
3 14
11.1%
5 14
11.1%
9 14
11.1%
1 7
 
5.6%
7 7
 
5.6%
0 7
 
5.6%
6 7
 
5.6%
Lowercase Letter
ValueCountFrequency (%)
c 28
28.6%
a 28
28.6%
b 21
21.4%
d 14
14.3%
e 7
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 28
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 154
61.1%
Latin 98
38.9%

Most frequent character per script

Common
ValueCountFrequency (%)
- 28
18.2%
2 21
13.6%
4 21
13.6%
8 14
9.1%
3 14
9.1%
5 14
9.1%
9 14
9.1%
1 7
 
4.5%
7 7
 
4.5%
0 7
 
4.5%
Latin
ValueCountFrequency (%)
c 28
28.6%
a 28
28.6%
b 21
21.4%
d 14
14.3%
e 7
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 252
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 28
11.1%
a 28
11.1%
- 28
11.1%
2 21
8.3%
b 21
8.3%
4 21
8.3%
8 14
 
5.6%
3 14
 
5.6%
5 14
 
5.6%
9 14
 
5.6%
Other values (6) 49
19.4%

identificationRemarks
Text

Constant  Missing 

Distinct1
Distinct (%)14.3%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters14
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowUS
2nd rowUS
3rd rowUS
4th rowUS
5th rowUS
ValueCountFrequency (%)
us 7
100.0%

Most occurring characters

ValueCountFrequency (%)
U 7
50.0%
S 7
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 14
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 7
50.0%
S 7
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 7
50.0%
S 7
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 7
50.0%
S 7
50.0%

taxonID
Text

Missing 

Distinct7
Distinct (%)100.0%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters168
Distinct characters15
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st row2024-12-02T13:57:35.184Z
2nd row2024-12-02T13:58:31.286Z
3rd row2024-12-02T13:56:38.525Z
4th row2024-12-02T13:59:43.862Z
5th row2024-12-02T13:56:47.781Z
ValueCountFrequency (%)
2024-12-02t13:57:35.184z 1
14.3%
2024-12-02t13:58:31.286z 1
14.3%
2024-12-02t13:56:38.525z 1
14.3%
2024-12-02t13:59:43.862z 1
14.3%
2024-12-02t13:56:47.781z 1
14.3%
2024-12-02t13:59:40.809z 1
14.3%
2024-12-02t13:58:46.380z 1
14.3%

Most occurring characters

ValueCountFrequency (%)
2 31
18.5%
0 17
10.1%
1 17
10.1%
- 14
8.3%
: 14
8.3%
4 12
 
7.1%
3 12
 
7.1%
5 10
 
6.0%
8 9
 
5.4%
T 7
 
4.2%
Other values (5) 25
14.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 119
70.8%
Other Punctuation 21
 
12.5%
Dash Punctuation 14
 
8.3%
Uppercase Letter 14
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 31
26.1%
0 17
14.3%
1 17
14.3%
4 12
 
10.1%
3 12
 
10.1%
5 10
 
8.4%
8 9
 
7.6%
6 5
 
4.2%
7 3
 
2.5%
9 3
 
2.5%
Other Punctuation
ValueCountFrequency (%)
: 14
66.7%
. 7
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 7
50.0%
Z 7
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 154
91.7%
Latin 14
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 31
20.1%
0 17
11.0%
1 17
11.0%
- 14
9.1%
: 14
9.1%
4 12
 
7.8%
3 12
 
7.8%
5 10
 
6.5%
8 9
 
5.8%
. 7
 
4.5%
Other values (3) 11
 
7.1%
Latin
ValueCountFrequency (%)
T 7
50.0%
Z 7
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 168
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 31
18.5%
0 17
10.1%
1 17
10.1%
- 14
8.3%
: 14
8.3%
4 12
 
7.1%
3 12
 
7.1%
5 10
 
6.0%
8 9
 
5.4%
T 7
 
4.2%
Other values (5) 25
14.9%
Distinct22054
Distinct (%)4.8%
Missing211
Missing (%)< 0.1%
Memory size3.5 MiB

Length

Max length8
Median length7
Mean length6.847697038
Min length2

Characters and Unicode

Total characters3115709
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique4768 ?
Unique (%)1.0%

Sample

1st row5213106
2nd row7822511
3rd row5209001
4th row2359811
5th row2369651
ValueCountFrequency (%)
4274 1630
 
0.4%
2360481 1121
 
0.2%
2359014 1113
 
0.2%
2359823 1006
 
0.2%
2376138 1001
 
0.2%
2366967 904
 
0.2%
2367736 893
 
0.2%
2394503 857
 
0.2%
2361357 853
 
0.2%
2358931 760
 
0.2%
Other values (22044) 444863
97.8%

Most occurring characters

ValueCountFrequency (%)
2 599111
19.2%
3 470277
15.1%
4 304204
9.8%
5 294417
9.4%
8 259388
8.3%
0 253770
8.1%
9 251072
8.1%
1 239346
 
7.7%
7 223101
 
7.2%
6 221023
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3115709
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 599111
19.2%
3 470277
15.1%
4 304204
9.8%
5 294417
9.4%
8 259388
8.3%
0 253770
8.1%
9 251072
8.1%
1 239346
 
7.7%
7 223101
 
7.2%
6 221023
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 3115709
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 599111
19.2%
3 470277
15.1%
4 304204
9.8%
5 294417
9.4%
8 259388
8.3%
0 253770
8.1%
9 251072
8.1%
1 239346
 
7.7%
7 223101
 
7.2%
6 221023
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3115709
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 599111
19.2%
3 470277
15.1%
4 304204
9.8%
5 294417
9.4%
8 259388
8.3%
0 253770
8.1%
9 251072
8.1%
1 239346
 
7.7%
7 223101
 
7.2%
6 221023
 
7.1%

parentNameUsageID
Text

Missing 

Distinct3
Distinct (%)100.0%
Missing455209
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters9
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique3 ?
Unique (%)100.0%

Sample

1st row0.5
2nd row8.5
3rd row2.0
ValueCountFrequency (%)
0.5 1
33.3%
8.5 1
33.3%
2.0 1
33.3%

Most occurring characters

ValueCountFrequency (%)
. 3
33.3%
0 2
22.2%
5 2
22.2%
8 1
 
11.1%
2 1
 
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
66.7%
Other Punctuation 3
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
33.3%
5 2
33.3%
8 1
16.7%
2 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 3
33.3%
0 2
22.2%
5 2
22.2%
8 1
 
11.1%
2 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 3
33.3%
0 2
22.2%
5 2
22.2%
8 1
 
11.1%
2 1
 
11.1%

originalNameUsageID
Text

Missing 

Distinct2
Distinct (%)66.7%
Missing455209
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters9
Distinct characters4
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique1 ?
Unique (%)33.3%

Sample

1st row0.5
2nd row0.5
3rd row2.0
ValueCountFrequency (%)
0.5 2
66.7%
2.0 1
33.3%

Most occurring characters

ValueCountFrequency (%)
0 3
33.3%
. 3
33.3%
5 2
22.2%
2 1
 
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
66.7%
Other Punctuation 3
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3
50.0%
5 2
33.3%
2 1
 
16.7%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3
33.3%
. 3
33.3%
5 2
22.2%
2 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3
33.3%
. 3
33.3%
5 2
22.2%
2 1
 
11.1%

namePublishedInID
Text

Missing 

Distinct3
Distinct (%)42.9%
Missing455205
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length153
Median length48
Mean length86.42857143
Min length48

Characters and Unicode

Total characters605
Distinct characters21
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique1 ?
Unique (%)14.3%

Sample

1st rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
2nd rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES;CONTINENT_INVALID
3rd rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
4th rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
5th rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;GEODETIC_DATUM_INVALID;CONTINENT_DERIVED_FROM_COORDINATES;CONTINENT_INVALID
ValueCountFrequency (%)
occurrence_status_inferred_from_individual_count 4
57.1%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;continent_invalid 2
28.6%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;geodetic_datum_invalid;continent_derived_from_coordinates;continent_invalid 1
 
14.3%

Most occurring characters

ValueCountFrequency (%)
_ 58
9.6%
E 54
 
8.9%
N 53
 
8.8%
I 52
 
8.6%
D 45
 
7.4%
T 44
 
7.3%
R 44
 
7.3%
C 41
 
6.8%
O 40
 
6.6%
U 35
 
5.8%
Other values (11) 139
23.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 531
87.8%
Connector Punctuation 58
 
9.6%
Other Punctuation 10
 
1.7%
Decimal Number 6
 
1.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 54
10.2%
N 53
10.0%
I 52
9.8%
D 45
8.5%
T 44
8.3%
R 44
8.3%
C 41
7.7%
O 40
7.5%
U 35
 
6.6%
A 28
 
5.3%
Other values (7) 95
17.9%
Decimal Number
ValueCountFrequency (%)
8 3
50.0%
4 3
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 58
100.0%
Other Punctuation
ValueCountFrequency (%)
; 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 531
87.8%
Common 74
 
12.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 54
10.2%
N 53
10.0%
I 52
9.8%
D 45
8.5%
T 44
8.3%
R 44
8.3%
C 41
7.7%
O 40
7.5%
U 35
 
6.6%
A 28
 
5.3%
Other values (7) 95
17.9%
Common
ValueCountFrequency (%)
_ 58
78.4%
; 10
 
13.5%
8 3
 
4.1%
4 3
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 605
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 58
9.6%
E 54
 
8.9%
N 53
 
8.8%
I 52
 
8.6%
D 45
 
7.4%
T 44
 
7.3%
R 44
 
7.3%
C 41
 
6.8%
O 40
 
6.6%
U 35
 
5.8%
Other values (11) 139
23.0%

taxonConceptID
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing455210
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters20
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowStillImage
2nd rowStillImage
ValueCountFrequency (%)
stillimage 2
100.0%

Most occurring characters

ValueCountFrequency (%)
l 4
20.0%
S 2
10.0%
t 2
10.0%
i 2
10.0%
I 2
10.0%
m 2
10.0%
a 2
10.0%
g 2
10.0%
e 2
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
80.0%
Uppercase Letter 4
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 4
25.0%
t 2
12.5%
i 2
12.5%
m 2
12.5%
a 2
12.5%
g 2
12.5%
e 2
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 2
50.0%
I 2
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 4
20.0%
S 2
10.0%
t 2
10.0%
i 2
10.0%
I 2
10.0%
m 2
10.0%
a 2
10.0%
g 2
10.0%
e 2
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 4
20.0%
S 2
10.0%
t 2
10.0%
i 2
10.0%
I 2
10.0%
m 2
10.0%
a 2
10.0%
g 2
10.0%
e 2
10.0%
Distinct28366
Distinct (%)6.2%
Missing0
Missing (%)0.0%
Memory size3.5 MiB

Length

Max length111
Median length81
Mean length34.33414102
Min length4

Characters and Unicode

Total characters15629313
Distinct characters91
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?

Unique

Unique8055 ?
Unique (%)1.8%

Sample

1st rowEchidna nebulosa (Ahl, 1789)
2nd rowMugil Linnaeus, 1758
3rd rowCryptocentrus filifer (Valenciennes, 1837)
4th rowRhinichthys cataractae (Valenciennes, 1842)
5th rowCentropomus ensiferus Poey, 1860
ValueCountFrequency (%)
74427
 
4.0%
linnaeus 26768
 
1.4%
bleeker 23949
 
1.3%
1758 20993
 
1.1%
valenciennes 20020
 
1.1%
cuvier 18941
 
1.0%
jordan 16870
 
0.9%
bloch 15687
 
0.8%
lacepède 13855
 
0.7%
1801 13309
 
0.7%
Other values (20362) 1613420
86.8%

Most occurring characters

ValueCountFrequency (%)
1403027
 
9.0%
e 1045274
 
6.7%
a 1034703
 
6.6%
i 926974
 
5.9%
s 920415
 
5.9%
n 772411
 
4.9%
r 768150
 
4.9%
o 766823
 
4.9%
u 641083
 
4.1%
l 587298
 
3.8%
Other values (81) 6763155
43.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10580971
67.7%
Decimal Number 1707640
 
10.9%
Space Separator 1403027
 
9.0%
Uppercase Letter 966521
 
6.2%
Other Punctuation 504834
 
3.2%
Open Punctuation 231890
 
1.5%
Close Punctuation 231890
 
1.5%
Dash Punctuation 2540
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1045274
9.9%
a 1034703
9.8%
i 926974
 
8.8%
s 920415
 
8.7%
n 772411
 
7.3%
r 768150
 
7.3%
o 766823
 
7.2%
u 641083
 
6.1%
l 587298
 
5.6%
t 578646
 
5.5%
Other values (35) 2539194
24.0%
Uppercase Letter
ValueCountFrequency (%)
C 100059
10.4%
S 98835
10.2%
B 89015
 
9.2%
L 85231
 
8.8%
G 84945
 
8.8%
P 66777
 
6.9%
A 54030
 
5.6%
R 50242
 
5.2%
M 48437
 
5.0%
E 42560
 
4.4%
Other values (18) 246390
25.5%
Decimal Number
ValueCountFrequency (%)
1 499218
29.2%
8 355270
20.8%
9 178960
 
10.5%
7 131050
 
7.7%
5 111806
 
6.5%
0 107029
 
6.3%
2 87774
 
5.1%
6 85320
 
5.0%
3 84836
 
5.0%
4 66377
 
3.9%
Other Punctuation
ValueCountFrequency (%)
, 430063
85.2%
& 74427
 
14.7%
. 270
 
0.1%
' 74
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1403027
100.0%
Open Punctuation
ValueCountFrequency (%)
( 231890
100.0%
Close Punctuation
ValueCountFrequency (%)
) 231890
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2540
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11547492
73.9%
Common 4081821
 
26.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1045274
 
9.1%
a 1034703
 
9.0%
i 926974
 
8.0%
s 920415
 
8.0%
n 772411
 
6.7%
r 768150
 
6.7%
o 766823
 
6.6%
u 641083
 
5.6%
l 587298
 
5.1%
t 578646
 
5.0%
Other values (63) 3505715
30.4%
Common
ValueCountFrequency (%)
1403027
34.4%
1 499218
 
12.2%
, 430063
 
10.5%
8 355270
 
8.7%
( 231890
 
5.7%
) 231890
 
5.7%
9 178960
 
4.4%
7 131050
 
3.2%
5 111806
 
2.7%
0 107029
 
2.6%
Other values (8) 401618
 
9.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15573317
99.6%
None 55996
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1403027
 
9.0%
e 1045274
 
6.7%
a 1034703
 
6.6%
i 926974
 
6.0%
s 920415
 
5.9%
n 772411
 
5.0%
r 768150
 
4.9%
o 766823
 
4.9%
u 641083
 
4.1%
l 587298
 
3.8%
Other values (60) 6707159
43.1%
None
ValueCountFrequency (%)
ü 24263
43.3%
è 13883
24.8%
å 11605
20.7%
ö 3033
 
5.4%
é 1849
 
3.3%
ø 571
 
1.0%
á 277
 
0.5%
ó 163
 
0.3%
ă 147
 
0.3%
ç 62
 
0.1%
Other values (11) 143
 
0.3%

acceptedNameUsage
Text

Constant  Missing 

Distinct 1
Distinct (%) 14.3%
Missing 455205
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 35
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row false
2nd row false
3rd row false
4th row false
5th row false
ValueCountFrequency (%)
false 7
100.0%

Most occurring characters

ValueCountFrequency (%)
f 7
20.0%
a 7
20.0%
l 7
20.0%
s 7
20.0%
e 7
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 35
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 7
20.0%
a 7
20.0%
l 7
20.0%
s 7
20.0%
e 7
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 35
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 7
20.0%
a 7
20.0%
l 7
20.0%
s 7
20.0%
e 7
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 7
20.0%
a 7
20.0%
l 7
20.0%
s 7
20.0%
e 7
20.0%

parentNameUsage
Text

Missing 

Distinct 7
Distinct (%) 100.0%
Missing 455205
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 49
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 100.0%

Sample

1st row 2341036
2nd row 2384468
3rd row 2353475
4th row 2373066
5th row 2414948
ValueCountFrequency (%)
2341036 1
14.3%
2384468 1
14.3%
2353475 1
14.3%
2373066 1
14.3%
2414948 1
14.3%
2393782 1
14.3%
2335095 1
14.3%

Most occurring characters

ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 49
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Common 49
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

originalNameUsage
Text

Missing 

Distinct 7
Distinct (%) 100.0%
Missing 455205
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 49
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 100.0%

Sample

1st row 2341036
2nd row 2384468
3rd row 2353475
4th row 2373066
5th row 2414948
ValueCountFrequency (%)
2341036 1
14.3%
2384468 1
14.3%
2353475 1
14.3%
2373066 1
14.3%
2414948 1
14.3%
2393782 1
14.3%
2335095 1
14.3%

Most occurring characters

ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 49
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Common 49
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

nameAccordingTo
Text

Constant  Missing 

Distinct 1
Distinct (%) 14.3%
Missing 455205
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 7
Distinct characters 1
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1
ValueCountFrequency (%)
1 7
100.0%

Most occurring characters

ValueCountFrequency (%)
1 7
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 7
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 7
100.0%

namePublishedIn
Text

Constant  Missing 

Distinct 1
Distinct (%) 14.3%
Missing 455205
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 14
Distinct characters 1
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 44
2nd row 44
3rd row 44
4th row 44
5th row 44
ValueCountFrequency (%)
44 7
100.0%

Most occurring characters

ValueCountFrequency (%)
4 14
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 14
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 14
100.0%

higherClassification
Text

Distinct 868
Distinct (%) 0.2%
Missing 231
Missing (%) 0.1%
Memory size 3.5 MiB

Length

Max length 164
Median length 155
Mean length 131.5133379
Min length 3

Characters and Unicode

Total characters 59836070
Distinct characters 58
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 71
Unique (%) < 0.1%

Sample

1st row Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Elopomorpha, Anguilliformes, Muraenoidei, Muraenidae, Muraeninae
2nd row Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Percoidei, Mugilidae
3rd row Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Gobioidei, Gobiidae, Gobiinae
4th row Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Ostariophysi, Cypriniformes, Cyprinidae
5th row Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Percoidei, Centropomidae
ValueCountFrequency (%)
chordata 454965
 
9.9%
animalia 454921
 
9.9%
vertebrata 454410
 
9.8%
osteichthyes 444515
 
9.6%
actinopterygii 444459
 
9.6%
neopterygii 444025
 
9.6%
acanthopterygii 293090
 
6.4%
perciformes 213808
 
4.6%
percoidei 96925
 
2.1%
ostariophysi 67590
 
1.5%
Other values (974) 1246012
27.0%

Most occurring characters

ValueCountFrequency (%)
i 6690834
 
11.2%
e 5764651
 
9.6%
t 4862198
 
8.1%
a 4453200
 
7.4%
, 4159739
 
7.0%
4159739
 
7.0%
r 4156105
 
6.9%
o 3437162
 
5.7%
h 2157318
 
3.6%
n 2101258
 
3.5%
Other values (48) 17893866
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 46901857
78.4%
Uppercase Letter 4614713
 
7.7%
Other Punctuation 4159739
 
7.0%
Space Separator 4159739
 
7.0%
Decimal Number 22
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 6690834
14.3%
e 5764651
12.3%
t 4862198
10.4%
a 4453200
9.5%
r 4156105
8.9%
o 3437162
 
7.3%
h 2157318
 
4.6%
n 2101258
 
4.5%
y 1975345
 
4.2%
c 1930135
 
4.1%
Other values (16) 9373651
20.0%
Uppercase Letter
ValueCountFrequency (%)
A 1292509
28.0%
C 704712
15.3%
O 545269
11.8%
V 454484
 
9.8%
N 451678
 
9.8%
P 431310
 
9.3%
S 209834
 
4.5%
G 105660
 
2.3%
L 88680
 
1.9%
B 85568
 
1.9%
Other values (13) 245009
 
5.3%
Decimal Number
ValueCountFrequency (%)
5 6
27.3%
7 5
22.7%
8 4
18.2%
0 3
13.6%
3 2
 
9.1%
9 1
 
4.5%
1 1
 
4.5%
Other Punctuation
ValueCountFrequency (%)
, 4159739
100.0%
Space Separator
ValueCountFrequency (%)
4159739
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 51516570
86.1%
Common 8319500
 
13.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 6690834
13.0%
e 5764651
11.2%
t 4862198
 
9.4%
a 4453200
 
8.6%
r 4156105
 
8.1%
o 3437162
 
6.7%
h 2157318
 
4.2%
n 2101258
 
4.1%
y 1975345
 
3.8%
c 1930135
 
3.7%
Other values (39) 13988364
27.2%
Common
ValueCountFrequency (%)
, 4159739
50.0%
4159739
50.0%
5 6
 
< 0.1%
7 5
 
< 0.1%
8 4
 
< 0.1%
0 3
 
< 0.1%
3 2
 
< 0.1%
9 1
 
< 0.1%
1 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 59836070
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 6690834
 
11.2%
e 5764651
 
9.6%
t 4862198
 
8.1%
a 4453200
 
7.4%
, 4159739
 
7.0%
4159739
 
7.0%
r 4156105
 
6.9%
o 3437162
 
5.7%
h 2157318
 
3.6%
n 2101258
 
3.5%
Other values (48) 17893866
29.9%

kingdom
Text

Distinct 9
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 3.5 MiB

Length

Max length 14
Median length 8
Mean length 8.00267348
Min length 4

Characters and Unicode

Total characters 3642913
Distinct characters 23
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) < 0.1%

Sample

1st row Animalia
2nd row Animalia
3rd row Animalia
4th row Animalia
5th row Animalia
ValueCountFrequency (%)
animalia 454998
99.9%
incertae 207
 
< 0.1%
sedis 207
 
< 0.1%
5153 1
 
< 0.1%
8535 1
 
< 0.1%
6880497 1
 
< 0.1%
8522 1
 
< 0.1%
4215 1
 
< 0.1%
4504 1
 
< 0.1%
5097 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
i 910410
25.0%
a 910203
25.0%
n 455205
12.5%
A 454998
12.5%
m 454998
12.5%
l 454998
12.5%
e 621
 
< 0.1%
s 414
 
< 0.1%
r 207
 
< 0.1%
t 207
 
< 0.1%
Other values (13) 652
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3187677
87.5%
Uppercase Letter 454998
 
12.5%
Space Separator 207
 
< 0.1%
Decimal Number 31
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 910410
28.6%
a 910203
28.6%
n 455205
14.3%
m 454998
14.3%
l 454998
14.3%
e 621
 
< 0.1%
s 414
 
< 0.1%
r 207
 
< 0.1%
t 207
 
< 0.1%
c 207
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
5 8
25.8%
8 4
12.9%
4 4
12.9%
0 3
 
9.7%
2 3
 
9.7%
1 2
 
6.5%
3 2
 
6.5%
9 2
 
6.5%
7 2
 
6.5%
6 1
 
3.2%
Uppercase Letter
ValueCountFrequency (%)
A 454998
100.0%
Space Separator
ValueCountFrequency (%)
207
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3642675
> 99.9%
Common 238
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 910410
25.0%
a 910203
25.0%
n 455205
12.5%
A 454998
12.5%
m 454998
12.5%
l 454998
12.5%
e 621
 
< 0.1%
s 414
 
< 0.1%
r 207
 
< 0.1%
t 207
 
< 0.1%
Other values (2) 414
 
< 0.1%
Common
ValueCountFrequency (%)
207
87.0%
5 8
 
3.4%
8 4
 
1.7%
4 4
 
1.7%
0 3
 
1.3%
2 3
 
1.3%
1 2
 
0.8%
3 2
 
0.8%
9 2
 
0.8%
7 2
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3642913
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 910410
25.0%
a 910203
25.0%
n 455205
12.5%
A 454998
12.5%
m 454998
12.5%
l 454998
12.5%
e 621
 
< 0.1%
s 414
 
< 0.1%
r 207
 
< 0.1%
t 207
 
< 0.1%
Other values (13) 652
 
< 0.1%

phylum
Text

Distinct 9
Distinct (%) < 0.1%
Missing 285
Missing (%) 0.1%
Memory size 3.5 MiB

Length

Max length 10
Median length 8
Mean length 8.000015387
Min length 7

Characters and Unicode

Total characters 3639423
Distinct characters 19
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) < 0.1%

Sample

1st row Chordata
2nd row Chordata
3rd row Chordata
4th row Chordata
5th row Chordata
ValueCountFrequency (%)
chordata 454913
> 99.9%
arthropoda 7
 
< 0.1%
2341007 1
 
< 0.1%
2384450 1
 
< 0.1%
2353451 1
 
< 0.1%
2373062 1
 
< 0.1%
2414937 1
 
< 0.1%
2371535 1
 
< 0.1%
2335094 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 909833
25.0%
o 454927
12.5%
r 454927
12.5%
h 454920
12.5%
d 454920
12.5%
t 454920
12.5%
C 454913
12.5%
3 11
 
< 0.1%
2 8
 
< 0.1%
p 7
 
< 0.1%
Other values (9) 37
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3184454
87.5%
Uppercase Letter 454920
 
12.5%
Decimal Number 49
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
5 6
12.2%
0 5
10.2%
1 4
 
8.2%
7 4
 
8.2%
9 2
 
4.1%
8 1
 
2.0%
6 1
 
2.0%
Lowercase Letter
ValueCountFrequency (%)
a 909833
28.6%
o 454927
14.3%
r 454927
14.3%
h 454920
14.3%
d 454920
14.3%
t 454920
14.3%
p 7
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
C 454913
> 99.9%
A 7
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3639374
> 99.9%
Common 49
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
5 6
12.2%
0 5
10.2%
1 4
 
8.2%
7 4
 
8.2%
9 2
 
4.1%
8 1
 
2.0%
6 1
 
2.0%
Latin
ValueCountFrequency (%)
a 909833
25.0%
o 454927
12.5%
r 454927
12.5%
h 454920
12.5%
d 454920
12.5%
t 454920
12.5%
C 454913
12.5%
p 7
 
< 0.1%
A 7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3639423
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 909833
25.0%
o 454927
12.5%
r 454927
12.5%
h 454920
12.5%
d 454920
12.5%
t 454920
12.5%
C 454913
12.5%
3 11
 
< 0.1%
2 8
 
< 0.1%
p 7
 
< 0.1%
Other values (9) 37
 
< 0.1%

class
Text

Missing 

Distinct 9
Distinct (%) 0.1%
Missing 444746
Missing (%) 97.7%
Memory size 3.5 MiB

Length

Max length 14
Median length 14
Mean length 13.50496847
Min length 6

Characters and Unicode

Total characters 141343
Distinct characters 27
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row Elasmobranchii
2nd row Petromyzonti
3rd row Petromyzonti
4th row Elasmobranchii
5th row Elasmobranchii
ValueCountFrequency (%)
elasmobranchii 8825
84.3%
petromyzonti 565
 
5.4%
leptocardii 514
 
4.9%
holocephali 362
 
3.5%
myxini 150
 
1.4%
dipneusti 28
 
0.3%
coelacanthi 14
 
0.1%
arachnida 7
 
0.1%
amphibia 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
i 19984
14.1%
a 18569
13.1%
o 11207
7.9%
r 9911
 
7.0%
c 9722
 
6.9%
n 9589
 
6.8%
l 9563
 
6.8%
m 9391
 
6.6%
h 9209
 
6.5%
s 8853
 
6.3%
Other values (17) 25345
17.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 130877
92.6%
Uppercase Letter 10466
 
7.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 19984
15.3%
a 18569
14.2%
o 11207
8.6%
r 9911
7.6%
c 9722
7.4%
n 9589
7.3%
l 9563
7.3%
m 9391
7.2%
h 9209
7.0%
s 8853
6.8%
Other values (9) 14879
11.4%
Uppercase Letter
ValueCountFrequency (%)
E 8825
84.3%
P 565
 
5.4%
L 514
 
4.9%
H 362
 
3.5%
M 150
 
1.4%
D 28
 
0.3%
C 14
 
0.1%
A 8
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 141343
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 19984
14.1%
a 18569
13.1%
o 11207
7.9%
r 9911
 
7.0%
c 9722
 
6.9%
n 9589
 
6.8%
l 9563
 
6.8%
m 9391
 
6.6%
h 9209
 
6.5%
s 8853
 
6.3%
Other values (17) 25345
17.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 141343
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 19984
14.1%
a 18569
13.1%
o 11207
7.9%
r 9911
 
7.0%
c 9722
 
6.9%
n 9589
 
6.8%
l 9563
 
6.8%
m 9391
 
6.6%
h 9209
 
6.5%
s 8853
 
6.3%
Other values (17) 25345
17.9%

order
Text

Distinct 71
Distinct (%) < 0.1%
Missing 1001
Missing (%) 0.2%
Memory size 3.5 MiB

Length

Max length 20
Median length 19
Mean length 12.46148376
Min length 7

Characters and Unicode

Total characters 5660143
Distinct characters 48
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) < 0.1%

Sample

1st row Anguilliformes
2nd row Mugiliformes
3rd row Perciformes
4th row Cypriniformes
5th row Perciformes
ValueCountFrequency (%)
perciformes 212582
46.8%
cypriniformes 33752
 
7.4%
scorpaeniformes 17672
 
3.9%
characiformes 17478
 
3.8%
anguilliformes 17113
 
3.8%
siluriformes 14280
 
3.1%
myctophiformes 13708
 
3.0%
pleuronectiformes 12320
 
2.7%
stomiiformes 12085
 
2.7%
tetraodontiformes 10526
 
2.3%
Other values (61) 92695
20.4%

Most occurring characters

ValueCountFrequency (%)
r 810127
14.3%
e 762624
13.5%
o 587988
10.4%
i 567492
10.0%
m 475972
8.4%
s 464456
8.2%
f 454197
8.0%
c 292652
 
5.2%
P 226159
 
4.0%
n 146439
 
2.6%
Other values (38) 872037
15.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5205890
92.0%
Uppercase Letter 454204
 
8.0%
Decimal Number 49
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 810127
15.6%
e 762624
14.6%
o 587988
11.3%
i 567492
10.9%
m 475972
9.1%
s 464456
8.9%
f 454197
8.7%
c 292652
 
5.6%
n 146439
 
2.8%
p 101430
 
1.9%
Other values (13) 542513
10.4%
Uppercase Letter
ValueCountFrequency (%)
P 226159
49.8%
C 72877
 
16.0%
S 56796
 
12.5%
A 29049
 
6.4%
B 17150
 
3.8%
M 16497
 
3.6%
T 10795
 
2.4%
G 9454
 
2.1%
O 7183
 
1.6%
L 4153
 
0.9%
Other values (5) 4091
 
0.9%
Decimal Number
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 5660094
> 99.9%
Common 49
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 810127
14.3%
e 762624
13.5%
o 587988
10.4%
i 567492
10.0%
m 475972
8.4%
s 464456
8.2%
f 454197
8.0%
c 292652
 
5.2%
P 226159
 
4.0%
n 146439
 
2.6%
Other values (28) 871988
15.4%
Common
ValueCountFrequency (%)
3 11
22.4%
2 8
16.3%
4 7
14.3%
6 4
 
8.2%
8 4
 
8.2%
5 4
 
8.2%
0 3
 
6.1%
7 3
 
6.1%
9 3
 
6.1%
1 2
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5660143
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 810127
14.3%
e 762624
13.5%
o 587988
10.4%
i 567492
10.0%
m 475972
8.4%
s 464456
8.2%
f 454197
8.0%
c 292652
 
5.2%
P 226159
 
4.0%
n 146439
 
2.6%
Other values (38) 872037
15.4%

superfamily
Text

Missing 

Distinct 7
Distinct (%) 100.0%
Missing 455205
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 24
Median length 18
Mean length 18.28571429
Min length 15

Characters and Unicode

Total characters 128
Distinct characters 27
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 100.0%

Sample

1st row Noturus nocturnus
2nd row Thalassoma lunare
3rd row Brycon falcatus
4th row Pseudotropheus elongatus
5th row Halieutaea brevicauda
ValueCountFrequency (%)
noturus 1
 
7.1%
nocturnus 1
 
7.1%
thalassoma 1
 
7.1%
lunare 1
 
7.1%
brycon 1
 
7.1%
falcatus 1
 
7.1%
pseudotropheus 1
 
7.1%
elongatus 1
 
7.1%
halieutaea 1
 
7.1%
brevicauda 1
 
7.1%
Other values (4) 4
28.6%

Most occurring characters

ValueCountFrequency (%)
s 14
10.9%
a 14
10.9%
u 13
 
10.2%
o 9
 
7.0%
r 9
 
7.0%
e 9
 
7.0%
7
 
5.5%
c 7
 
5.5%
l 7
 
5.5%
t 6
 
4.7%
Other values (17) 33
25.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 114
89.1%
Space Separator 7
 
5.5%
Uppercase Letter 7
 
5.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 14
12.3%
a 14
12.3%
u 13
11.4%
o 9
7.9%
r 9
7.9%
e 9
7.9%
c 7
 
6.1%
l 7
 
6.1%
t 6
 
5.3%
n 5
 
4.4%
Other values (10) 21
18.4%
Uppercase Letter
ValueCountFrequency (%)
B 2
28.6%
H 1
14.3%
N 1
14.3%
P 1
14.3%
T 1
14.3%
S 1
14.3%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 121
94.5%
Common 7
 
5.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 14
11.6%
a 14
11.6%
u 13
10.7%
o 9
 
7.4%
r 9
 
7.4%
e 9
 
7.4%
c 7
 
5.8%
l 7
 
5.8%
t 6
 
5.0%
n 5
 
4.1%
Other values (16) 28
23.1%
Common
ValueCountFrequency (%)
7
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 128
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 14
10.9%
a 14
10.9%
u 13
 
10.2%
o 9
 
7.0%
r 9
 
7.0%
e 9
 
7.0%
7
 
5.5%
c 7
 
5.5%
l 7
 
5.5%
t 6
 
4.7%
Other values (17) 33
25.8%

family
Text

Distinct 561
Distinct (%) 0.1%
Missing 833
Missing (%) 0.2%
Memory size 3.5 MiB

Length

Max length 40
Median length 36
Mean length 10.79361062
Min length 6

Characters and Unicode

Total characters 4904390
Distinct characters 63
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 23
Unique (%) < 0.1%

Sample

1st row Muraenidae
2nd row Mugilidae
3rd row Gobiidae
4th row Cyprinidae
5th row Centropomidae
ValueCountFrequency (%)
cyprinidae 27640
 
6.1%
gobiidae 26017
 
5.7%
pomacentridae 16208
 
3.6%
labridae 14638
 
3.2%
blenniidae 14508
 
3.2%
myctophidae 13553
 
3.0%
apogonidae 12381
 
2.7%
serranidae 11376
 
2.5%
characidae 9124
 
2.0%
stomiidae 7881
 
1.7%
Other values (575) 301078
66.3%

Most occurring characters

ValueCountFrequency (%)
a 707945
14.4%
e 650924
13.3%
i 646771
13.2%
d 489722
10.0%
o 279193
 
5.7%
r 276574
 
5.6%
n 253779
 
5.2%
t 211871
 
4.3%
c 160423
 
3.3%
h 139416
 
2.8%
Other values (53) 1087772
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4449936
90.7%
Uppercase Letter 454388
 
9.3%
Decimal Number 28
 
< 0.1%
Space Separator 25
 
< 0.1%
Other Punctuation 9
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 707945
15.9%
e 650924
14.6%
i 646771
14.5%
d 489722
11.0%
o 279193
 
6.3%
r 276574
 
6.2%
n 253779
 
5.7%
t 211871
 
4.8%
c 160423
 
3.6%
h 139416
 
3.1%
Other values (17) 633318
14.2%
Uppercase Letter
ValueCountFrequency (%)
C 99617
21.9%
S 65857
14.5%
P 51368
11.3%
M 37729
 
8.3%
A 35150
 
7.7%
G 34853
 
7.7%
L 29900
 
6.6%
B 27207
 
6.0%
H 16335
 
3.6%
E 14399
 
3.2%
Other values (13) 41973
9.2%
Decimal Number
ValueCountFrequency (%)
1 9
32.1%
8 6
21.4%
4 4
14.3%
0 2
 
7.1%
5 2
 
7.1%
9 2
 
7.1%
6 2
 
7.1%
7 1
 
3.6%
Other Punctuation
ValueCountFrequency (%)
, 7
77.8%
& 2
 
22.2%
Space Separator
ValueCountFrequency (%)
25
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4904324
> 99.9%
Common 66
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 707945
14.4%
e 650924
13.3%
i 646771
13.2%
d 489722
10.0%
o 279193
 
5.7%
r 276574
 
5.6%
n 253779
 
5.2%
t 211871
 
4.3%
c 160423
 
3.3%
h 139416
 
2.8%
Other values (40) 1087706
22.2%
Common
ValueCountFrequency (%)
25
37.9%
1 9
 
13.6%
, 7
 
10.6%
8 6
 
9.1%
4 4
 
6.1%
0 2
 
3.0%
( 2
 
3.0%
) 2
 
3.0%
5 2
 
3.0%
9 2
 
3.0%
Other values (3) 5
 
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4904389
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 707945
14.4%
e 650924
13.3%
i 646771
13.2%
d 489722
10.0%
o 279193
 
5.7%
r 276574
 
5.6%
n 253779
 
5.2%
t 211871
 
4.3%
c 160423
 
3.3%
h 139416
 
2.8%
Other values (52) 1087771
22.2%
None
ValueCountFrequency (%)
ü 1
100.0%

subfamily
Text

Missing 

Distinct 7
Distinct (%) 100.0%
Missing 455205
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 24
Median length 18
Mean length 18.28571429
Min length 15

Characters and Unicode

Total characters 128
Distinct characters 27
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 100.0%

Sample

1st row Noturus nocturnus
2nd row Thalassoma lunare
3rd row Brycon falcatus
4th row Pseudotropheus elongatus
5th row Halieutaea brevicauda
ValueCountFrequency (%)
noturus 1
 
7.1%
nocturnus 1
 
7.1%
thalassoma 1
 
7.1%
lunare 1
 
7.1%
brycon 1
 
7.1%
falcatus 1
 
7.1%
pseudotropheus 1
 
7.1%
elongatus 1
 
7.1%
halieutaea 1
 
7.1%
brevicauda 1
 
7.1%
Other values (4) 4
28.6%

Most occurring characters

ValueCountFrequency (%)
s 14
10.9%
a 14
10.9%
u 13
 
10.2%
o 9
 
7.0%
r 9
 
7.0%
e 9
 
7.0%
7
 
5.5%
c 7
 
5.5%
l 7
 
5.5%
t 6
 
4.7%
Other values (17) 33
25.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 114
89.1%
Space Separator 7
 
5.5%
Uppercase Letter 7
 
5.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 14
12.3%
a 14
12.3%
u 13
11.4%
o 9
7.9%
r 9
7.9%
e 9
7.9%
c 7
 
6.1%
l 7
 
6.1%
t 6
 
5.3%
n 5
 
4.4%
Other values (10) 21
18.4%
Uppercase Letter
ValueCountFrequency (%)
B 2
28.6%
H 1
14.3%
N 1
14.3%
P 1
14.3%
T 1
14.3%
S 1
14.3%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 121
94.5%
Common 7
 
5.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 14
11.6%
a 14
11.6%
u 13
10.7%
o 9
 
7.4%
r 9
 
7.4%
e 9
 
7.4%
c 7
 
5.8%
l 7
 
5.8%
t 6
 
5.0%
n 5
 
4.1%
Other values (16) 28
23.1%
Common
ValueCountFrequency (%)
7
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 128
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 14
10.9%
a 14
10.9%
u 13
 
10.2%
o 9
 
7.0%
r 9
 
7.0%
e 9
 
7.0%
7
 
5.5%
c 7
 
5.5%
l 7
 
5.5%
t 6
 
4.7%
Other values (17) 33
25.8%

subtribe
Text

Constant  Missing 

Distinct      1
Distinct (%)  14.3%
Missing       455205
Missing (%)   > 99.9%
Memory size   3.5 MiB

Length

Max length     3
Median length  3
Mean length    3
Min length     3

Characters and Unicode

Total characters     21
Distinct characters  3
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
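
As a rough illustration of the character properties mentioned above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the report generator's own code: the function name `category_counts` and the sample list are assumptions made for illustration. The standard library exposes general categories only as two-letter codes (`Lu`, `Ll`, `Zs`, ...), not scripts or blocks, so only the category breakdown is reproduced.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as
            # "Lu" (uppercase letter), "Ll" (lowercase letter), "Zs" (space).
            counts[unicodedata.category(ch)] += 1
    return counts


# Hypothetical sample: seven copies of "EML", mirroring the values
# listed for this variable.
sample = ["EML"] * 7
counts = category_counts(sample)
total = sum(counts.values())
for code, n in counts.most_common():
    print(f"{code}: {n} ({n / total:.1%})")
```

For the seven `EML` values this gives 21 characters, all in category `Lu`, which agrees with the "Uppercase Letter 21" row reported for this variable.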

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  EML
2nd row  EML
3rd row  EML
4th row  EML
5th row  EML

Value  Count  Frequency (%)
eml    7      100.0%

Most occurring characters

Value  Count  Frequency (%)
E      7      33.3%
M      7      33.3%
L      7      33.3%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  21     100.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
E      7      33.3%
M      7      33.3%
L      7      33.3%

Most occurring scripts

Value  Count  Frequency (%)
Latin  21     100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
E      7      33.3%
M      7      33.3%
L      7      33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21     100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
E      7      33.3%
M      7      33.3%
L      7      33.3%

genus
Text

Missing 

Distinct      4427
Distinct (%)  1.0%
Missing       23586
Missing (%)   5.2%
Memory size   3.5 MiB

Length

Max length     24
Median length  19
Mean length    9.89925074
Min length     3

Characters and Unicode

Total characters     4272774
Distinct characters  65
Distinct categories  5
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      411
Unique (%)  0.1%

Sample

1st row  Echidna
2nd row  Mugil
3rd row  Myersina
4th row  Rhinichthys
5th row  Centropomus

Value                Count   Frequency (%)
etheostoma           5026    1.2%
gymnothorax          4350    1.0%
lepomis              4347    1.0%
notropis             4334    1.0%
chaetodon            4239    1.0%
lutjanus             3825    0.9%
halichoeres          3118    0.7%
chromis              3031    0.7%
acanthurus           2923    0.7%
pomacentrus          2919    0.7%
Other values (4417)  393514  91.2%

Most occurring characters

Value              Count    Frequency (%)
o                  399528   9.4%
s                  397208   9.3%
a                  333959   7.8%
i                  300262   7.0%
e                  279207   6.5%
r                  261041   6.1%
u                  247831   5.8%
t                  244352   5.7%
n                  224564   5.3%
h                  210029   4.9%
Other values (55)  1374793  32.2%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   3840987  89.9%
Uppercase Letter   431633   10.1%
Decimal Number     119      < 0.1%
Other Punctuation  21       < 0.1%
Dash Punctuation   14       < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count    Frequency (%)
o                  399528   10.4%
s                  397208   10.3%
a                  333959   8.7%
i                  300262   7.8%
e                  279207   7.3%
r                  261041   6.8%
u                  247831   6.5%
t                  244352   6.4%
n                  224564   5.8%
h                  210029   5.5%
Other values (16)  943006   24.6%

Uppercase Letter
Value              Count    Frequency (%)
C                  61021    14.1%
P                  51248    11.9%
S                  48961    11.3%
A                  40636    9.4%
E                  28869    6.7%
L                  26991    6.3%
M                  24971    5.8%
H                  24933    5.8%
N                  18765    4.3%
G                  18448    4.3%
Other values (16)  86790    20.1%

Decimal Number
Value  Count  Frequency (%)
2      31     26.1%
0      17     14.3%
1      17     14.3%
4      12     10.1%
3      12     10.1%
5      10     8.4%
8      9      7.6%
6      5      4.2%
9      3      2.5%
7      3      2.5%

Other Punctuation
Value  Count  Frequency (%)
:      14     66.7%
.      7      33.3%

Dash Punctuation
Value  Count  Frequency (%)
-      14     100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   4272620  > 99.9%
Common  154      < 0.1%

Most frequent character per script

Latin
Value              Count    Frequency (%)
o                  399528   9.4%
s                  397208   9.3%
a                  333959   7.8%
i                  300262   7.0%
e                  279207   6.5%
r                  261041   6.1%
u                  247831   5.8%
t                  244352   5.7%
n                  224564   5.3%
h                  210029   4.9%
Other values (42)  1374639  32.2%

Common
Value             Count  Frequency (%)
2                 31     20.1%
0                 17     11.0%
1                 17     11.0%
:                 14     9.1%
-                 14     9.1%
4                 12     7.8%
3                 12     7.8%
5                 10     6.5%
8                 9      5.8%
.                 7      4.5%
Other values (3)  11     7.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  4272774  100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
o                  399528   9.4%
s                  397208   9.3%
a                  333959   7.8%
i                  300262   7.0%
e                  279207   6.5%
r                  261041   6.1%
u                  247831   5.8%
t                  244352   5.7%
n                  224564   5.3%
h                  210029   4.9%
Other values (55)  1374793  32.2%

genericName
Text

Missing 

Distinct      5329
Distinct (%)  1.2%
Missing       23579
Missing (%)   5.2%
Memory size   3.5 MiB

Length

Max length     24
Median length  19
Mean length    9.850122674
Min length     2

Characters and Unicode

Total characters     4251638
Distinct characters  62
Distinct categories  5
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      744
Unique (%)  0.2%

Sample

1st row  Echidna
2nd row  Mugil
3rd row  Cryptocentrus
4th row  Rhinichthys
5th row  Centropomus

Value                Count   Frequency (%)
notropis             7148    1.7%
etheostoma           4849    1.1%
gymnothorax          4324    1.0%
lepomis              4276    1.0%
chaetodon            4249    1.0%
lutjanus             3807    0.9%
halichoeres          3126    0.7%
chromis              3122    0.7%
pomacentrus          2956    0.7%
acanthurus           2892    0.7%
Other values (5319)  390884  90.6%

Most occurring characters

Value              Count    Frequency (%)
o                  401047   9.4%
s                  398676   9.4%
a                  333316   7.8%
i                  299389   7.0%
e                  276042   6.5%
r                  259555   6.1%
t                  246425   5.8%
u                  244647   5.8%
n                  220237   5.2%
h                  207615   4.9%
Other values (52)  1364689  32.1%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   3819844  89.8%
Uppercase Letter   431640   10.2%
Decimal Number     119      < 0.1%
Other Punctuation  21       < 0.1%
Dash Punctuation   14       < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count    Frequency (%)
o                  401047   10.5%
s                  398676   10.4%
a                  333316   8.7%
i                  299389   7.8%
e                  276042   7.2%
r                  259555   6.8%
t                  246425   6.5%
u                  244647   6.4%
n                  220237   5.8%
h                  207615   5.4%
Other values (16)  932895   24.4%

Uppercase Letter
Value              Count    Frequency (%)
C                  59157    13.7%
P                  51434    11.9%
S                  48641    11.3%
A                  41597    9.6%
E                  28275    6.6%
L                  26626    6.2%
H                  25136    5.8%
M                  25069    5.8%
N                  20800    4.8%
G                  18793    4.4%
Other values (16)  86112    19.9%

Decimal Number
Value  Count  Frequency (%)
2      35     29.4%
1      28     23.5%
4      21     17.6%
0      14     11.8%
8      7      5.9%
3      7      5.9%
6      7      5.9%

Other Punctuation
Value  Count  Frequency (%)
:      14     66.7%
.      7      33.3%

Dash Punctuation
Value  Count  Frequency (%)
-      14     100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   4251484  > 99.9%
Common  154      < 0.1%

Most frequent character per script

Latin
Value              Count    Frequency (%)
o                  401047   9.4%
s                  398676   9.4%
a                  333316   7.8%
i                  299389   7.0%
e                  276042   6.5%
r                  259555   6.1%
t                  246425   5.8%
u                  244647   5.8%
n                  220237   5.2%
h                  207615   4.9%
Other values (42)  1364535  32.1%

Common
Value  Count  Frequency (%)
2      35     22.7%
1      28     18.2%
4      21     13.6%
0      14     9.1%
-      14     9.1%
:      14     9.1%
8      7      4.5%
3      7      4.5%
.      7      4.5%
6      7      4.5%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  4251638  100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
o                  401047   9.4%
s                  398676   9.4%
a                  333316   7.8%
i                  299389   7.0%
e                  276042   6.5%
r                  259555   6.1%
t                  246425   5.8%
u                  244647   5.8%
n                  220237   5.2%
h                  207615   4.9%
Other values (52)  1364689  32.1%

subgenus
Text

Missing 

Distinct      2
Distinct (%)  33.3%
Missing       455206
Missing (%)   > 99.9%
Memory size   3.5 MiB

Length

Max length     5
Median length  4
Mean length    4.166666667
Min length     4

Characters and Unicode

Total characters     25
Distinct characters  8
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      1
Unique (%)  16.7%

Sample

1st row  false
2nd row  true
3rd row  true
4th row  true
5th row  true

Value  Count  Frequency (%)
true   5      83.3%
false  1      16.7%

Most occurring characters

Value  Count  Frequency (%)
e      6      24.0%
t      5      20.0%
r      5      20.0%
u      5      20.0%
f      1      4.0%
a      1      4.0%
l      1      4.0%
s      1      4.0%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  25     100.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
e      6      24.0%
t      5      20.0%
r      5      20.0%
u      5      20.0%
f      1      4.0%
a      1      4.0%
l      1      4.0%
s      1      4.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  25     100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
e      6      24.0%
t      5      20.0%
r      5      20.0%
u      5      20.0%
f      1      4.0%
a      1      4.0%
l      1      4.0%
s      1      4.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  25     100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
e      6      24.0%
t      5      20.0%
r      5      20.0%
u      5      20.0%
f      1      4.0%
a      1      4.0%
l      1      4.0%
s      1      4.0%

specificEpithet
Text

Missing 

Distinct      12528
Distinct (%)  3.3%
Missing       70259
Missing (%)   15.4%
Memory size   3.5 MiB

Length

Max length     20
Median length  17
Mean length    8.890235951
Min length     2

Characters and Unicode

Total characters     3422323
Distinct characters  29
Distinct categories  2
Distinct scripts     2
Distinct blocks      2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      2693
Unique (%)  0.7%

Sample

1st row  nebulosa
2nd row  filifer
3rd row  cataractae
4th row  ensiferus
5th row  inferomaculata

Value                 Count   Frequency (%)
maculatus             1803    0.5%
fasciatus             1624    0.4%
lineatus              1573    0.4%
punctatus             1558    0.4%
affinis               1520    0.4%
ocellatus             1448    0.4%
nigricans             1438    0.4%
cornutus              1264    0.3%
notatus               1167    0.3%
niger                 1160    0.3%
Other values (12518)  370398  96.2%

Most occurring characters

Value              Count    Frequency (%)
a                  384962   11.2%
s                  357791   10.5%
i                  357450   10.4%
u                  276235   8.1%
e                  241281   7.1%
r                  227543   6.6%
t                  215016   6.3%
n                  207432   6.1%
o                  194229   5.7%
l                  192527   5.6%
Other values (19)  767857   22.4%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  3422237  > 99.9%
Dash Punctuation  86       < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count    Frequency (%)
a                  384962   11.2%
s                  357791   10.5%
i                  357450   10.4%
u                  276235   8.1%
e                  241281   7.1%
r                  227543   6.6%
t                  215016   6.3%
n                  207432   6.1%
o                  194229   5.7%
l                  192527   5.6%
Other values (18)  767771   22.4%

Dash Punctuation
Value  Count  Frequency (%)
-      86     100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   3422237  > 99.9%
Common  86       < 0.1%

Most frequent character per script

Latin
Value              Count    Frequency (%)
a                  384962   11.2%
s                  357791   10.5%
i                  357450   10.4%
u                  276235   8.1%
e                  241281   7.1%
r                  227543   6.6%
t                  215016   6.3%
n                  207432   6.1%
o                  194229   5.7%
l                  192527   5.6%
Other values (18)  767771   22.4%

Common
Value  Count  Frequency (%)
-      86     100.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  3422320  > 99.9%
None   3        < 0.1%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
a                  384962   11.2%
s                  357791   10.5%
i                  357450   10.4%
u                  276235   8.1%
e                  241281   7.1%
r                  227543   6.6%
t                  215016   6.3%
n                  207432   6.1%
o                  194229   5.7%
l                  192527   5.6%
Other values (17)  767854   22.4%

None
Value  Count  Frequency (%)
ü      2      66.7%
ö      1      33.3%

infraspecificEpithet
Text

Missing 

Distinct      681
Distinct (%)  8.3%
Missing       447018
Missing (%)   98.2%
Memory size   3.5 MiB

Length

Max length     19
Median length  16
Mean length    8.942762997
Min length     3

Characters and Unicode

Total characters     73277
Distinct characters  27
Distinct categories  2
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      208
Unique (%)  2.5%

Sample

1st row  niloticus
2nd row  ramosus
3rd row  vexillare
4th row  vermiculatus
5th row  exilicauda

Value               Count  Frequency (%)
leptocephalus       303    3.7%
atromaculatus       222    2.7%
crocodilus          221    2.7%
atratulus           169    2.1%
vermiculatus        156    1.9%
ferox               145    1.8%
commersonnii        138    1.7%
interocularis       121    1.5%
purpurescens        120    1.5%
salmoides           114    1.4%
Other values (671)  6485   79.1%

Most occurring characters

Value              Count  Frequency (%)
s                  8034   11.0%
a                  7736   10.6%
i                  6963   9.5%
u                  6427   8.8%
r                  5139   7.0%
e                  5070   6.9%
o                  5038   6.9%
l                  4896   6.7%
c                  4349   5.9%
t                  4159   5.7%
Other values (17)  15466  21.1%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  73265  > 99.9%
Dash Punctuation  12     < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
s                  8034   11.0%
a                  7736   10.6%
i                  6963   9.5%
u                  6427   8.8%
r                  5139   7.0%
e                  5070   6.9%
o                  5038   6.9%
l                  4896   6.7%
c                  4349   5.9%
t                  4159   5.7%
Other values (16)  15454  21.1%

Dash Punctuation
Value  Count  Frequency (%)
-      12     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   73265  > 99.9%
Common  12     < 0.1%

Most frequent character per script

Latin
Value              Count  Frequency (%)
s                  8034   11.0%
a                  7736   10.6%
i                  6963   9.5%
u                  6427   8.8%
r                  5139   7.0%
e                  5070   6.9%
o                  5038   6.9%
l                  4896   6.7%
c                  4349   5.9%
t                  4159   5.7%
Other values (16)  15454  21.1%

Common
Value  Count  Frequency (%)
-      12     100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  73277  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
s                  8034   11.0%
a                  7736   10.6%
i                  6963   9.5%
u                  6427   8.8%
r                  5139   7.0%
e                  5070   6.9%
o                  5038   6.9%
l                  4896   6.7%
c                  4349   5.9%
t                  4159   5.7%
Other values (17)  15466  21.1%

cultivarEpithet
Text

Missing 

Distinct      5
Distinct (%)  83.3%
Missing       455206
Missing (%)   > 99.9%
Memory size   3.5 MiB

Length

Max length     13
Median length  7
Mean length    8.166666667
Min length     4

Characters and Unicode

Total characters     49
Distinct characters  14
Distinct categories  2
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      4
Unique (%)  66.7%

Sample

1st row  NORTH_AMERICA
2nd row  AFRICA
3rd row  LATIN_AMERICA
4th row  AFRICA
5th row  ASIA

Value          Count  Frequency (%)
africa         2      33.3%
north_america  1      16.7%
latin_america  1      16.7%
asia           1      16.7%
oceania        1      16.7%

Most occurring characters

Value             Count  Frequency (%)
A                 13     26.5%
I                 7      14.3%
R                 5      10.2%
C                 5      10.2%
N                 3      6.1%
E                 3      6.1%
F                 2      4.1%
O                 2      4.1%
T                 2      4.1%
_                 2      4.1%
Other values (4)  5      10.2%

Most occurring categories

Value                  Count  Frequency (%)
Uppercase Letter       47     95.9%
Connector Punctuation  2      4.1%

Most frequent character per category

Uppercase Letter
Value             Count  Frequency (%)
A                 13     27.7%
I                 7      14.9%
R                 5      10.6%
C                 5      10.6%
N                 3      6.4%
E                 3      6.4%
F                 2      4.3%
O                 2      4.3%
T                 2      4.3%
M                 2      4.3%
Other values (3)  3      6.4%

Connector Punctuation
Value  Count  Frequency (%)
_      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   47     95.9%
Common  2      4.1%

Most frequent character per script

Latin
Value             Count  Frequency (%)
A                 13     27.7%
I                 7      14.9%
R                 5      10.6%
C                 5      10.6%
N                 3      6.4%
E                 3      6.4%
F                 2      4.3%
O                 2      4.3%
T                 2      4.3%
M                 2      4.3%
Other values (3)  3      6.4%

Common
Value  Count  Frequency (%)
_      2      100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  49     100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
A                 13     26.5%
I                 7      14.3%
R                 5      10.2%
C                 5      10.2%
N                 3      6.1%
E                 3      6.1%
F                 2      4.1%
O                 2      4.1%
T                 2      4.1%
_                 2      4.1%
Other values (4)  5      10.2%
Distinct      10
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   3.5 MiB

Length

Max length     13
Median length  7
Mean length    6.796793582
Min length     5

Characters and Unicode

Total characters     3093982
Distinct characters  22
Distinct categories  2
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  SPECIES
2nd row  GENUS
3rd row  SPECIES
4th row  SPECIES
5th row  SPECIES

Value          Count   Frequency (%)
species        376767  82.8%
genus          46673   10.3%
family         22827   5.0%
subspecies     8175    1.8%
order          347     0.1%
kingdom        204     < 0.1%
phylum         198     < 0.1%
variety        12      < 0.1%
north_america  7       < 0.1%
class          2       < 0.1%

Most occurring characters

Value              Count   Frequency (%)
S                  824736  26.7%
E                  816923  26.4%
I                  407992  13.2%
P                  385140  12.4%
C                  384951  12.4%
U                  55046   1.8%
N                  46884   1.5%
G                  46877   1.5%
M                  23236   0.8%
Y                  23037   0.7%
Other values (12)  79160   2.6%

Most occurring categories

Value                  Count    Frequency (%)
Uppercase Letter       3093975  > 99.9%
Connector Punctuation  7        < 0.1%

Most frequent character per category

Uppercase Letter
Value              Count   Frequency (%)
S                  824736  26.7%
E                  816923  26.4%
I                  407992  13.2%
P                  385140  12.4%
C                  384951  12.4%
U                  55046   1.8%
N                  46884   1.5%
G                  46877   1.5%
M                  23236   0.8%
Y                  23037   0.7%
Other values (11)  79153   2.6%

Connector Punctuation
Value  Count  Frequency (%)
_      7      100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   3093975  > 99.9%
Common  7        < 0.1%

Most frequent character per script

Latin
Value              Count   Frequency (%)
S                  824736  26.7%
E                  816923  26.4%
I                  407992  13.2%
P                  385140  12.4%
C                  384951  12.4%
U                  55046   1.8%
N                  46884   1.5%
G                  46877   1.5%
M                  23236   0.8%
Y                  23037   0.7%
Other values (11)  79153   2.6%

Common
Value  Count  Frequency (%)
_      7      100.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  3093982  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
S                  824736  26.7%
E                  816923  26.4%
I                  407992  13.2%
P                  385140  12.4%
C                  384951  12.4%
U                  55046   1.8%
N                  46884   1.5%
G                  46877   1.5%
M                  23236   0.8%
Y                  23037   0.7%
Other values (12)  79160   2.6%

verbatimTaxonRank
Text

Missing 

Distinct      2
Distinct (%)  100.0%
Missing       455210
Missing (%)   > 99.9%
Memory size   3.5 MiB

Length

Max length     3
Median length  3
Mean length    3
Min length     3

Characters and Unicode

Total characters     6
Distinct characters  6
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      2
Unique (%)  100.0%

Sample

1st row  SYC
2nd row  PHL

Value  Count  Frequency (%)
syc    1      50.0%
phl    1      50.0%

Most occurring characters

Value  Count  Frequency (%)
S      1      16.7%
Y      1      16.7%
C      1      16.7%
P      1      16.7%
H      1      16.7%
L      1      16.7%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  6      100.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
S      1      16.7%
Y      1      16.7%
C      1      16.7%
P      1      16.7%
H      1      16.7%
L      1      16.7%

Most occurring scripts

Value  Count  Frequency (%)
Latin  6      100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
S      1      16.7%
Y      1      16.7%
C      1      16.7%
P      1      16.7%
H      1      16.7%
L      1      16.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6      100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
S      1      16.7%
Y      1      16.7%
C      1      16.7%
P      1      16.7%
H      1      16.7%
L      1      16.7%

vernacularName
Text

Missing 

Distinct      2
Distinct (%)  100.0%
Missing       455210
Missing (%)   > 99.9%
Memory size   3.5 MiB

Length

Max length     11
Median length  10.5
Mean length    10.5
Min length     10

Characters and Unicode

Total characters     21
Distinct characters  11
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      2
Unique (%)  100.0%

Sample

1st row  Seychelles
2nd row  Philippines

Value        Count  Frequency (%)
seychelles   1      50.0%
philippines  1      50.0%

Most occurring characters

Value  Count  Frequency (%)
e      4      19.0%
l      3      14.3%
i      3      14.3%
h      2      9.5%
s      2      9.5%
p      2      9.5%
S      1      4.8%
y      1      4.8%
c      1      4.8%
P      1      4.8%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  19     90.5%
Uppercase Letter  2      9.5%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
e      4      21.1%
l      3      15.8%
i      3      15.8%
h      2      10.5%
s      2      10.5%
p      2      10.5%
y      1      5.3%
c      1      5.3%
n      1      5.3%

Uppercase Letter
Value  Count  Frequency (%)
S      1      50.0%
P      1      50.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  21     100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
e      4      19.0%
l      3      14.3%
i      3      14.3%
h      2      9.5%
s      2      9.5%
p      2      9.5%
S      1      4.8%
y      1      4.8%
c      1      4.8%
P      1      4.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21     100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
e      4      19.0%
l      3      14.3%
i      3      14.3%
h      2      9.5%
s      2      9.5%
p      2      9.5%
S      1      4.8%
y      1      4.8%
c      1      4.8%
P      1      4.8%

nomenclaturalCode
Text

Missing 

Distinct      2
Distinct (%)  100.0%
Missing       455210
Missing (%)   > 99.9%
Memory size   3.5 MiB

Length

Max length     8
Median length  8
Mean length    8
Min length     8

Characters and Unicode

Total characters     16
Distinct characters  13
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      2
Unique (%)  100.0%

Sample

1st row  SYC.20_1
2nd row  PHL.36_1

Value     Count  Frequency (%)
syc.20_1  1      50.0%
phl.36_1  1      50.0%

Most occurring characters

Value             Count  Frequency (%)
.                 2      12.5%
_                 2      12.5%
1                 2      12.5%
S                 1      6.2%
Y                 1      6.2%
C                 1      6.2%
2                 1      6.2%
0                 1      6.2%
P                 1      6.2%
H                 1      6.2%
Other values (3)  3      18.8%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         6      37.5%
Uppercase Letter       6      37.5%
Other Punctuation      2      12.5%
Connector Punctuation  2      12.5%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
S      1      16.7%
Y      1      16.7%
C      1      16.7%
P      1      16.7%
H      1      16.7%
L      1      16.7%

Decimal Number
Value  Count  Frequency (%)
1      2      33.3%
2      1      16.7%
0      1      16.7%
3      1      16.7%
6      1      16.7%

Other Punctuation
Value  Count  Frequency (%)
.      2      100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  10     62.5%
Latin   6      37.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
.      2      20.0%
_      2      20.0%
1      2      20.0%
2      1      10.0%
0      1      10.0%
3      1      10.0%
6      1      10.0%

Latin
Value  Count  Frequency (%)
S      1      16.7%
Y      1      16.7%
C      1      16.7%
P      1      16.7%
H      1      16.7%
L      1      16.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  16     100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
.                 2      12.5%
_                 2      12.5%
1                 2      12.5%
S                 1      6.2%
Y                 1      6.2%
C                 1      6.2%
2                 1      6.2%
0                 1      6.2%
P                 1      6.2%
H                 1      6.2%
Other values (3)  3      18.8%
Distinct      5
Distinct (%)  < 0.1%
Missing       209
Missing (%)   < 0.1%
Memory size   3.5 MiB

Length

Max length     13
Median length  8
Mean length    7.910209383
Min length     6

Characters and Unicode

Total characters     3599169
Distinct characters  28
Distinct categories  3
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      2
Unique (%)  < 0.1%

Sample

1st row  ACCEPTED
2nd row  ACCEPTED
3rd row  SYNONYM
4th row  ACCEPTED
5th row  ACCEPTED

Value     Count   Frequency (%)
accepted  413893  91.0%
synonym   40858   9.0%
doubtful  250     0.1%
outer     1       < 0.1%
islands   1       < 0.1%
iloilo    1       < 0.1%

Most occurring characters

Value              Count   Frequency (%)
E                  827786  23.0%
C                  827786  23.0%
T                  414143  11.5%
D                  414143  11.5%
A                  413893  11.5%
P                  413893  11.5%
Y                  81716   2.3%
N                  81716   2.3%
O                  41109   1.1%
S                  40858   1.1%
Other values (18)  42126   1.2%

Most occurring categories

Value             Count    Frequency (%)
Uppercase Letter  3599153  > 99.9%
Lowercase Letter  15       < 0.1%
Space Separator   1        < 0.1%

Most frequent character per category

Uppercase Letter
Value             Count   Frequency (%)
E                 827786  23.0%
C                 827786  23.0%
T                 414143  11.5%
D                 414143  11.5%
A                 413893  11.5%
P                 413893  11.5%
Y                 81716   2.3%
N                 81716   2.3%
O                 41109   1.1%
S                 40858   1.1%
Other values (6)  42110   1.2%

Lowercase Letter
Value  Count  Frequency (%)
l      3      20.0%
s      2      13.3%
o      2      13.3%
u      1      6.7%
t      1      6.7%
e      1      6.7%
r      1      6.7%
a      1      6.7%
n      1      6.7%
d      1      6.7%

Space Separator
Value    Count  Frequency (%)
(space)  1      100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   3599168  > 99.9%
Common  1        < 0.1%

Most frequent character per script

Latin
Value              Count   Frequency (%)
E                  827786  23.0%
C                  827786  23.0%
T                  414143  11.5%
D                  414143  11.5%
A                  413893  11.5%
P                  413893  11.5%
Y                  81716   2.3%
N                  81716   2.3%
O                  41109   1.1%
S                  40858   1.1%
Other values (17)  42125   1.2%

Common
Value    Count  Frequency (%)
(space)  1      100.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  3599169  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
E                  827786  23.0%
C                  827786  23.0%
T                  414143  11.5%
D                  414143  11.5%
A                  413893  11.5%
P                  413893  11.5%
Y                  81716   2.3%
N                  81716   2.3%
O                  41109   1.1%
S                  40858   1.1%
Other values (18)  42126   1.2%

nomenclaturalStatus
Text

Constant  Missing 

Distinct      1
Distinct (%)  100.0%
Missing       455211
Missing (%)   > 99.9%
Memory size   3.5 MiB

Length

Max length     11
Median length  11
Mean length    11
Min length     11

Characters and Unicode

Total characters     11
Distinct characters  9
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      1
Unique (%)  100.0%

Sample

1st row  PHL.36.21_1

Value        Count  Frequency (%)
phl.36.21_1  1      100.0%

Most occurring characters

Value  Count  Frequency (%)
.      2      18.2%
1      2      18.2%
P      1      9.1%
H      1      9.1%
L      1      9.1%
3      1      9.1%
6      1      9.1%
2      1      9.1%
_      1      9.1%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         5      45.5%
Uppercase Letter       3      27.3%
Other Punctuation      2      18.2%
Connector Punctuation  1      9.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      2      40.0%
3      1      20.0%
6      1      20.0%
2      1      20.0%

Uppercase Letter
Value  Count  Frequency (%)
P      1      33.3%
H      1      33.3%
L      1      33.3%

Other Punctuation
Value  Count  Frequency (%)
.      2      100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8      72.7%
Latin   3      27.3%

Most frequent character per script

Common
Value  Count  Frequency (%)
.      2      25.0%
1      2      25.0%
3      1      12.5%
6      1      12.5%
2      1      12.5%
_      1      12.5%

Latin
Value  Count  Frequency (%)
P      1      33.3%
H      1      33.3%
L      1      33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  11     100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
.      2      18.2%
1      2      18.2%
P      1      9.1%
H      1      9.1%
L      1      9.1%
3      1      9.1%
6      1      9.1%
2      1      9.1%
_      1      9.1%

taxonRemarks
Text

Constant  Missing 

Distinct      1
Distinct (%)  100.0%
Missing       455211
Missing (%)   > 99.9%
Memory size   3.5 MiB

Length

Max length     11
Median length  11
Mean length    11
Min length     11

Characters and Unicode

Total characters     11
Distinct characters  8
Distinct categories  3
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      1
Unique (%)  100.0%

Sample

1st row  Iloilo City

Value   Count  Frequency (%)
iloilo  1      50.0%
city    1      50.0%

Most occurring characters

Value    Count  Frequency (%)
l        2      18.2%
o        2      18.2%
i        2      18.2%
I        1      9.1%
(space)  1      9.1%
C        1      9.1%
t        1      9.1%
y        1      9.1%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  8      72.7%
Uppercase Letter  2      18.2%
Space Separator   1      9.1%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
l      2      25.0%
o      2      25.0%
i      2      25.0%
t      1      12.5%
y      1      12.5%

Uppercase Letter
Value  Count  Frequency (%)
I      1      50.0%
C      1      50.0%

Space Separator
Value    Count  Frequency (%)
(space)  1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   10     90.9%
Common  1      9.1%

Most frequent character per script

Latin
Value  Count  Frequency (%)
l      2      20.0%
o      2      20.0%
i      2      20.0%
I      1      10.0%
C      1      10.0%
t      1      10.0%
y      1      10.0%

Common
Value    Count  Frequency (%)
(space)  1      100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  11     100.0%

Most frequent character per block

ASCII
Value    Count  Frequency (%)
l        2      18.2%
o        2      18.2%
i        2      18.2%
I        1      9.1%
(space)  1      9.1%
C        1      9.1%
t        1      9.1%
y        1      9.1%
Distinct      2
Distinct (%)  < 0.1%
Missing       6
Missing (%)   < 0.1%
Memory size   3.5 MiB

Length

Max length     36
Median length  36
Mean length    35.99995167
Min length     14

Characters and Unicode

Total characters     16387394
Distinct characters  21
Distinct categories  6
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      1
Unique (%)  < 0.1%

Sample

1st row  821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row  821cc27a-e3bb-4bc5-ac34-89ada245069d
3rd row  821cc27a-e3bb-4bc5-ac34-89ada245069d
4th row  821cc27a-e3bb-4bc5-ac34-89ada245069d
5th row  821cc27a-e3bb-4bc5-ac34-89ada245069d

Value                                 Count   Frequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d  455205  > 99.9%
phl.36.21.66_1                        1       < 0.1%

Most occurring characters

Value              Count     Frequency (%)
c                  1820820   11.1%
a                  1820820   11.1%
-                  1820820   11.1%
2                  1365616   8.3%
4                  1365615   8.3%
b                  1365615   8.3%
3                  910411    5.6%
d                  910410    5.6%
9                  910410    5.6%
5                  910410    5.6%
Other values (11)  3186447   19.4%

Most occurring categories

Value                  Count     Frequency (%)
Decimal Number         8193697   50.0%
Lowercase Letter       6372870   38.9%
Dash Punctuation       1820820   11.1%
Other Punctuation      3         < 0.1%
Uppercase Letter       3         < 0.1%
Connector Punctuation  1         < 0.1%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
2      1365616  16.7%
4      1365615  16.7%
3      910411   11.1%
9      910410   11.1%
5      910410   11.1%
8      910410   11.1%
6      455208   5.6%
1      455207   5.6%
7      455205   5.6%
0      455205   5.6%

Lowercase Letter
Value  Count    Frequency (%)
c      1820820  28.6%
a      1820820  28.6%
b      1365615  21.4%
d      910410   14.3%
e      455205   7.1%

Uppercase Letter
Value  Count  Frequency (%)
P      1      33.3%
H      1      33.3%
L      1      33.3%

Dash Punctuation
Value  Count    Frequency (%)
-      1820820  100.0%

Other Punctuation
Value  Count  Frequency (%)
.      3      100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      1      100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  10014521  61.1%
Latin   6372873   38.9%

Most frequent character per script

Common
Value             Count    Frequency (%)
-                 1820820  18.2%
2                 1365616  13.6%
4                 1365615  13.6%
3                 910411   9.1%
9                 910410   9.1%
5                 910410   9.1%
8                 910410   9.1%
6                 455208   4.5%
1                 455207   4.5%
7                 455205   4.5%
Other values (3)  455209   4.5%

Latin
Value  Count    Frequency (%)
c      1820820  28.6%
a      1820820  28.6%
b      1365615  21.4%
d      910410   14.3%
e      455205   7.1%
P      1        < 0.1%
H      1        < 0.1%
L      1        < 0.1%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  16387394  100.0%

Most frequent character per block

ASCII
Value              Count     Frequency (%)
c                  1820820   11.1%
a                  1820820   11.1%
-                  1820820   11.1%
2                  1365616   8.3%
4                  1365615   8.3%
b                  1365615   8.3%
3                  910411    5.6%
d                  910410    5.6%
9                  910410    5.6%
5                  910410    5.6%
Other values (11)  3186447   19.4%
Distinct      2
Distinct (%)  < 0.1%
Missing       6
Missing (%)   < 0.1%
Memory size   3.5 MiB

Length

Max length     9
Median length  2
Mean length    2.000015378
Min length     2

Characters and Unicode

Total characters     910419
Distinct characters  10
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      1
Unique (%)  < 0.1%

Sample

1st row  US
2nd row  US
3rd row  US
4th row  US
5th row  US

Value      Count   Frequency (%)
us         455205  > 99.9%
kahirupan  1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
U 455205
50.0%
S 455205
50.0%
a 2
 
< 0.1%
K 1
 
< 0.1%
h 1
 
< 0.1%
i 1
 
< 0.1%
r 1
 
< 0.1%
u 1
 
< 0.1%
p 1
 
< 0.1%
n 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 910411
> 99.9%
Lowercase Letter 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
25.0%
h 1
12.5%
i 1
12.5%
r 1
12.5%
u 1
12.5%
p 1
12.5%
n 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
U 455205
50.0%
S 455205
50.0%
K 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 910419
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 455205
50.0%
S 455205
50.0%
a 2
 
< 0.1%
K 1
 
< 0.1%
h 1
 
< 0.1%
i 1
 
< 0.1%
r 1
 
< 0.1%
u 1
 
< 0.1%
p 1
 
< 0.1%
n 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 910419
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 455205
50.0%
S 455205
50.0%
a 2
 
< 0.1%
K 1
 
< 0.1%
h 1
 
< 0.1%
i 1
 
< 0.1%
r 1
 
< 0.1%
u 1
 
< 0.1%
p 1
 
< 0.1%
n 1
 
< 0.1%
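
The length figures reported for this variable can be cross-checked against its character totals. The sketch below assumes that the mean length is the total character count divided by the number of non-missing values, and that the observation count is the 455212 given in the report overview; the function name `mean_length` is illustrative and not part of the profiling tool.

```python
def mean_length(total_characters: int, n_observations: int, n_missing: int) -> float:
    """Average string length over the non-missing values of a column."""
    non_missing = n_observations - n_missing
    return total_characters / non_missing


# Figures taken from the panel above: 910419 characters, 6 missing values,
# and 455212 observations (from the report overview).
mean_us = mean_length(910419, 455212, 6)
print(f"{mean_us:.9f}")
```

Under these assumptions the result agrees with the reported mean length of 2.000015378 to nine decimal places, which suggests the mean is taken over non-missing values only.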
Distinct: 173327
Distinct (%): 38.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99554933
Min length: 2

Characters and Unicode

Total characters: 10923062
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 50647
Unique (%): 11.1%

Sample

1st row: 2024-12-02T13:56:09.099Z
2nd row: 2024-12-02T13:56:08.596Z
3rd row: 2024-12-02T13:59:51.375Z
4th row: 2024-12-02T13:58:24.571Z
5th row: 2024-12-02T13:56:08.212Z

Value                     Count   Frequency (%)
2024-12-02t13:57:53.333z  14      < 0.1%
2024-12-02t13:57:01.873z  14      < 0.1%
2024-12-02t13:57:04.016z  13      < 0.1%
2024-12-02t13:57:52.916z  13      < 0.1%
2024-12-02t13:57:28.109z  13      < 0.1%
2024-12-02t13:57:41.128z  13      < 0.1%
2024-12-02t13:58:01.465z  13      < 0.1%
2024-12-02t13:57:03.178z  13      < 0.1%
2024-12-02t13:57:30.416z  13      < 0.1%
2024-12-02t13:57:30.873z  12      < 0.1%
Other values (173317)     455081  > 99.9%

Most occurring characters

ValueCountFrequency (%)
2 2078433
19.0%
0 1154714
10.6%
1 1148157
10.5%
- 910410
8.3%
: 910410
8.3%
4 731701
 
6.7%
5 722652
 
6.6%
3 721318
 
6.6%
T 455206
 
4.2%
Z 455205
 
4.2%
Other values (10) 1634856
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7737081
70.8%
Other Punctuation 1365147
 
12.5%
Uppercase Letter 910424
 
8.3%
Dash Punctuation 910410
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2078433
26.9%
0 1154714
14.9%
1 1148157
14.8%
4 731701
 
9.5%
5 722652
 
9.3%
3 721318
 
9.3%
7 349813
 
4.5%
9 291367
 
3.8%
6 275056
 
3.6%
8 263870
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
T 455206
50.0%
Z 455205
50.0%
L 3
 
< 0.1%
C 3
 
< 0.1%
N 3
 
< 0.1%
E 2
 
< 0.1%
D 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 910410
66.7%
. 454737
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 910410
100.0%
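
The category tables above group every character by its Unicode general category. A minimal sketch of that kind of tally, using Python's standard `unicodedata` module, is shown here; the label mapping and the function name `category_counts` are assumptions made for illustration and are not taken from the profiling tool's source.

```python
import unicodedata
from collections import Counter

# Assumed mapping from Unicode general-category codes to the labels used in the report.
CATEGORY_LABELS = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for categories not in the mapping.
            counts[CATEGORY_LABELS.get(code, code)] += 1
    return counts


# Two of the sample rows shown for this variable.
sample = ["2024-12-02T13:56:09.099Z", "2024-12-02T13:56:08.596Z"]
counts = category_counts(sample)
print(counts.most_common())
```

Each of these 24-character timestamps contributes 17 digits, 3 "Other Punctuation" characters (two colons and a period), 2 dashes and 2 uppercase letters, which mirrors the ordering of the categories in the table.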

Most occurring scripts

ValueCountFrequency (%)
Common 10012638
91.7%
Latin 910424
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2078433
20.8%
0 1154714
11.5%
1 1148157
11.5%
- 910410
9.1%
: 910410
9.1%
4 731701
 
7.3%
5 722652
 
7.2%
3 721318
 
7.2%
. 454737
 
4.5%
7 349813
 
3.5%
Other values (3) 830293
 
8.3%
Latin
ValueCountFrequency (%)
T 455206
50.0%
Z 455205
50.0%
L 3
 
< 0.1%
C 3
 
< 0.1%
N 3
 
< 0.1%
E 2
 
< 0.1%
D 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10923062
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2078433
19.0%
0 1154714
10.6%
1 1148157
10.5%
- 910410
8.3%
: 910410
8.3%
4 731701
 
6.7%
5 722652
 
6.6%
3 721318
 
6.6%
T 455206
 
4.2%
Z 455205
 
4.2%
Other values (10) 1634856
15.0%
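
Because this variable is profiled as Text, its timestamps would need to be parsed before any date arithmetic. The following is a minimal sketch, assuming the values follow the ISO 8601 pattern seen in the sample rows; the helper name `parse_utc` is hypothetical. Values that do not parse (the minimum length of 2 indicates that a few rows are not timestamps) are returned as `None`.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_utc(text: str) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp with a trailing 'Z'; return None if it is not one."""
    try:
        # Older Python versions do not accept 'Z' in fromisoformat, so spell out the offset.
        return datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        return None


parsed = parse_utc("2024-12-02T13:56:09.099Z")
not_a_timestamp = parse_utc("US")
print(parsed, not_a_timestamp)
```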

depth
Text

Missing 

Distinct: 3057
Distinct (%): 1.5%
Missing: 246174
Missing (%): 54.1%
Memory size: 3.5 MiB

Length

Max length: 19
Median length: 18
Mean length: 3.877084549
Min length: 3

Characters and Unicode

Total characters: 810458
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 590
Unique (%): 0.3%

Sample

1st row: 49.5
2nd row: 2.3
3rd row: 41.5
4th row: 7.5
5th row: 3.5

Value                Count   Frequency (%)
0.5                  14691   7.0%
1.0                  9522    4.6%
1.5                  7688    3.7%
3.0                  5171    2.5%
4.0                  5073    2.4%
2.5                  5043    2.4%
0.0                  4521    2.2%
2.0                  4468    2.1%
3.5                  4405    2.1%
5.0                  4020    1.9%
Other values (3047)  144436  69.1%

Most occurring characters

ValueCountFrequency (%)
. 209038
25.8%
0 188741
23.3%
5 117616
14.5%
1 74706
 
9.2%
2 53659
 
6.6%
3 37477
 
4.6%
4 33701
 
4.2%
7 28173
 
3.5%
6 25895
 
3.2%
9 21572
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 601420
74.2%
Other Punctuation 209038
 
25.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 188741
31.4%
5 117616
19.6%
1 74706
 
12.4%
2 53659
 
8.9%
3 37477
 
6.2%
4 33701
 
5.6%
7 28173
 
4.7%
6 25895
 
4.3%
9 21572
 
3.6%
8 19880
 
3.3%
Other Punctuation
ValueCountFrequency (%)
. 209038
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 810458
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 209038
25.8%
0 188741
23.3%
5 117616
14.5%
1 74706
 
9.2%
2 53659
 
6.6%
3 37477
 
4.6%
4 33701
 
4.2%
7 28173
 
3.5%
6 25895
 
3.2%
9 21572
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 810458
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 209038
25.8%
0 188741
23.3%
5 117616
14.5%
1 74706
 
9.2%
2 53659
 
6.6%
3 37477
 
4.6%
4 33701
 
4.2%
7 28173
 
3.5%
6 25895
 
3.2%
9 21572
 
2.7%
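
The depth values are profiled as Text even though every character is a digit or a decimal point. A minimal sketch of converting such a column to numbers while keeping missing cells distinct is given below; the helper name `to_float` is hypothetical, and the inputs are the sample rows shown for this variable plus an empty cell.

```python
from typing import Optional


def to_float(value: Optional[str]) -> Optional[float]:
    """Convert a text cell to float; missing or unparseable cells become None."""
    if value is None:
        return None
    text = value.strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


raw = ["49.5", "2.3", "41.5", "7.5", "3.5", ""]
depths = [to_float(cell) for cell in raw]
print(depths)
```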

depthAccuracy
Text

Missing 

Distinct: 1206
Distinct (%): 0.6%
Missing: 266866
Missing (%): 58.6%
Memory size: 3.5 MiB

Length

Max length: 20
Median length: 3
Mean length: 3.532891593
Min length: 3

Characters and Unicode

Total characters: 665406
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 220
Unique (%): 0.1%

Sample

1st row: 10.5
2nd row: 2.3
3rd row: 4.5
4th row: 0.5
5th row: 1.5

Value                Count  Frequency (%)
0.0                  39321  20.9%
0.5                  19670  10.4%
1.5                  14947  7.9%
1.0                  14430  7.7%
2.5                  8360   4.4%
2.0                  7760   4.1%
3.0                  7373   3.9%
5.0                  4367   2.3%
3.5                  3758   2.0%
0.25                 3396   1.8%
Other values (1196)  64964  34.5%

Most occurring characters

ValueCountFrequency (%)
0 209096
31.4%
. 188346
28.3%
5 96270
14.5%
1 52038
 
7.8%
2 32769
 
4.9%
9 24929
 
3.7%
3 20005
 
3.0%
4 15882
 
2.4%
7 11418
 
1.7%
6 8839
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 477060
71.7%
Other Punctuation 188346
 
28.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 209096
43.8%
5 96270
20.2%
1 52038
 
10.9%
2 32769
 
6.9%
9 24929
 
5.2%
3 20005
 
4.2%
4 15882
 
3.3%
7 11418
 
2.4%
6 8839
 
1.9%
8 5814
 
1.2%
Other Punctuation
ValueCountFrequency (%)
. 188346
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 665406
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 209096
31.4%
. 188346
28.3%
5 96270
14.5%
1 52038
 
7.8%
2 32769
 
4.9%
9 24929
 
3.7%
3 20005
 
3.0%
4 15882
 
2.4%
7 11418
 
1.7%
6 8839
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 665406
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 209096
31.4%
. 188346
28.3%
5 96270
14.5%
1 52038
 
7.8%
2 32769
 
4.9%
9 24929
 
3.7%
3 20005
 
3.0%
4 15882
 
2.4%
7 11418
 
1.7%
6 8839
 
1.3%
Distinct: 43
Distinct (%): 4.7%
Missing: 454306
Missing (%): 99.8%
Memory size: 3.5 MiB

Length

Max length: 18
Median length: 17
Mean length: 17.29028698
Min length: 16

Characters and Unicode

Total characters: 15665
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 13
Unique (%): 1.4%

Sample

1st row: 3435.2993691323722
2nd row: 1914.9010623948639
3rd row: 3286.3383926848273
4th row: 4049.579332802943
5th row: 3435.2993691323722

Value               Count  Frequency (%)
3997.886559051776   149    16.4%
1914.9010623948639  85     9.4%
4049.579332802943   75     8.3%
3435.2993691323722  74     8.2%
4315.889420844057   72     7.9%
3469.315853887778   51     5.6%
3286.3383926848273  50     5.5%
3413.2475218601576  44     4.9%
3868.839758506256   35     3.9%
4088.010727125954   28     3.1%
Other values (33)   243    26.8%

Most occurring characters

ValueCountFrequency (%)
3 1953
12.5%
9 1907
12.2%
8 1651
10.5%
5 1468
9.4%
4 1453
9.3%
2 1411
9.0%
7 1405
9.0%
6 1213
7.7%
0 1158
7.4%
1 1140
7.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14759
94.2%
Other Punctuation 906
 
5.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 1953
13.2%
9 1907
12.9%
8 1651
11.2%
5 1468
9.9%
4 1453
9.8%
2 1411
9.6%
7 1405
9.5%
6 1213
8.2%
0 1158
7.8%
1 1140
7.7%
Other Punctuation
ValueCountFrequency (%)
. 906
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 15665
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 1953
12.5%
9 1907
12.2%
8 1651
10.5%
5 1468
9.4%
4 1453
9.3%
2 1411
9.0%
7 1405
9.0%
6 1213
7.7%
0 1158
7.4%
1 1140
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15665
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 1953
12.5%
9 1907
12.2%
8 1651
10.5%
5 1468
9.4%
4 1453
9.3%
2 1411
9.0%
7 1405
9.0%
6 1213
7.7%
0 1158
7.4%
1 1140
7.3%

issue
Text

Distinct: 224
Distinct (%): < 0.1%
Missing: 15
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 213
Median length: 208
Mean length: 86.91829032
Min length: 46

Characters and Unicode

Total characters: 39564945
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 58
Unique (%): < 0.1%

Sample

1st row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
2nd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
3rd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
4th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
5th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT

Value  Count  Frequency (%)
occurrence_status_inferred_from_individual_count  145701  32.0%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;continent_invalid  90868  20.0%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_invalid  74492  16.4%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;continent_invalid  73665  16.2%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84  23814  5.2%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;geodetic_datum_invalid;continent_derived_from_coordinates;continent_invalid  5912  1.3%
occurrence_status_inferred_from_individual_count;country_derived_from_coordinates;geodetic_datum_assumed_wgs84;continent_invalid  4969  1.1%
occurrence_status_inferred_from_individual_count;country_coordinate_mismatch;geodetic_datum_assumed_wgs84;continent_invalid  4544  1.0%
occurrence_status_inferred_from_individual_count;taxon_match_higherrank  3432  0.8%
occurrence_status_inferred_from_individual_count;taxon_match_fuzzy  2582  0.6%
Other values (214)  25218  5.5%

Most occurring characters

ValueCountFrequency (%)
_ 3794873
9.6%
N 3679859
9.3%
E 3402744
 
8.6%
I 3339442
 
8.4%
T 2939907
 
7.4%
R 2888137
 
7.3%
D 2760112
 
7.0%
C 2717539
 
6.9%
O 2539725
 
6.4%
U 2340457
 
5.9%
Other values (18) 9162150
23.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 34671685
87.6%
Connector Punctuation 3794873
 
9.6%
Other Punctuation 696481
 
1.8%
Decimal Number 401906
 
1.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 3679859
10.6%
E 3402744
9.8%
I 3339442
9.6%
T 2939907
8.5%
R 2888137
8.3%
D 2760112
8.0%
C 2717539
7.8%
O 2539725
7.3%
U 2340457
 
6.8%
A 1753128
 
5.1%
Other values (14) 6310635
18.2%
Decimal Number
ValueCountFrequency (%)
8 200953
50.0%
4 200953
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3794873
100.0%
Other Punctuation
ValueCountFrequency (%)
; 696481
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 34671685
87.6%
Common 4893260
 
12.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 3679859
10.6%
E 3402744
9.8%
I 3339442
9.6%
T 2939907
8.5%
R 2888137
8.3%
D 2760112
8.0%
C 2717539
7.8%
O 2539725
7.3%
U 2340457
 
6.8%
A 1753128
 
5.1%
Other values (14) 6310635
18.2%
Common
ValueCountFrequency (%)
_ 3794873
77.6%
; 696481
 
14.2%
8 200953
 
4.1%
4 200953
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39564945
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 3794873
9.6%
N 3679859
9.3%
E 3402744
 
8.6%
I 3339442
 
8.4%
T 2939907
 
7.4%
R 2888137
 
7.3%
D 2760112
 
7.0%
C 2717539
 
6.9%
O 2539725
 
6.4%
U 2340457
 
5.9%
Other values (18) 9162150
23.2%
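
Each `issue` value packs several flags into one semicolon-separated string, so the 224 distinct values overstate the number of distinct flags. A minimal sketch of splitting the strings and counting individual flags is given below; the function name `flag_counts` is illustrative, and the three input strings are upper-case forms of combinations listed in the frequency table.

```python
from collections import Counter


def flag_counts(issue_values):
    """Split semicolon-separated issue strings and count each individual flag."""
    counts = Counter()
    for value in issue_values:
        if not value:
            continue  # skip missing cells
        counts.update(flag for flag in value.split(";") if flag)
    return counts


rows = [
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_INVALID",
]
flags = flag_counts(rows)
for flag, n in flags.most_common():
    print(flag, n)
```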

mediaType
Text

Missing 

Distinct: 34
Distinct (%): < 0.1%
Missing: 363819
Missing (%): 79.9%
Memory size: 3.5 MiB

Length

Max length: 659
Median length: 10
Mean length: 17.04920508
Min length: 10

Characters and Unicode

Total characters: 1558178
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: StillImage
2nd row: StillImage
3rd row: StillImage
4th row: StillImage
5th row: StillImage

Value  Count  Frequency (%)
stillimage  60095  65.8%
stillimage;stillimage  16175  17.7%
stillimage;stillimage;stillimage  9136  10.0%
stillimage;stillimage;stillimage;stillimage  3567  3.9%
stillimage;stillimage;stillimage;stillimage;stillimage  1344  1.5%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  427  0.5%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  208  0.2%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  113  0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  88  0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  65  0.1%
Other values (24)  175  0.2%

Most occurring characters

ValueCountFrequency (%)
l 299922
19.2%
S 149961
9.6%
t 149961
9.6%
i 149961
9.6%
I 149961
9.6%
m 149961
9.6%
a 149961
9.6%
g 149961
9.6%
e 149961
9.6%
; 58568
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1199688
77.0%
Uppercase Letter 299922
 
19.2%
Other Punctuation 58568
 
3.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 299922
25.0%
t 149961
12.5%
i 149961
12.5%
m 149961
12.5%
a 149961
12.5%
g 149961
12.5%
e 149961
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 149961
50.0%
I 149961
50.0%
Other Punctuation
ValueCountFrequency (%)
; 58568
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1499610
96.2%
Common 58568
 
3.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 299922
20.0%
S 149961
10.0%
t 149961
10.0%
i 149961
10.0%
I 149961
10.0%
m 149961
10.0%
a 149961
10.0%
g 149961
10.0%
e 149961
10.0%
Common
ValueCountFrequency (%)
; 58568
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1558178
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 299922
19.2%
S 149961
9.6%
t 149961
9.6%
i 149961
9.6%
I 149961
9.6%
m 149961
9.6%
a 149961
9.6%
g 149961
9.6%
e 149961
9.6%
; 58568
 
3.8%
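
The `mediaType` values repeat `StillImage` once per attached media item, so the number of items per record can be recovered by splitting on the semicolon. The sketch below, with the hypothetical helper `media_item_count`, also cross-checks the character tables: the number of semicolons should equal the number of items minus the number of non-missing records, assuming the observation count of 455212 from the report overview.

```python
def media_item_count(value: str) -> int:
    """Number of semicolon-separated media entries in one mediaType cell."""
    return len([part for part in value.split(";") if part])


# Record counts for the ten listed combinations, keyed by items per record.
listed_rows = {1: 60095, 2: 16175, 3: 9136, 4: 3567, 5: 1344,
               6: 427, 7: 208, 8: 113, 9: 88, 10: 65}
items_in_listed_rows = sum(n_items * n_records for n_items, n_records in listed_rows.items())

total_items = 149961          # occurrences of "S", one per StillImage entry
total_semicolons = 58568      # from the Other Punctuation table
non_missing = 455212 - 363819 # observations minus missing cells

print(media_item_count("StillImage;StillImage;StillImage"))
print(items_in_listed_rows, total_items - total_semicolons, non_missing)
```

The listed rows account for 147205 of the 149961 items, and the remainder falls under "Other values (24)".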
Distinct: 2
Distinct (%): < 0.1%
Missing: 7
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 5
Median length: 5
Mean length: 4.558539559
Min length: 4

Characters and Unicode

Total characters: 2075070
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false

Value  Count   Frequency (%)
false  254250  55.9%
true   200955  44.1%

Most occurring characters

ValueCountFrequency (%)
e 455205
21.9%
f 254250
12.3%
a 254250
12.3%
l 254250
12.3%
s 254250
12.3%
t 200955
9.7%
r 200955
9.7%
u 200955
9.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2075070
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 455205
21.9%
f 254250
12.3%
a 254250
12.3%
l 254250
12.3%
s 254250
12.3%
t 200955
9.7%
r 200955
9.7%
u 200955
9.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 2075070
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 455205
21.9%
f 254250
12.3%
a 254250
12.3%
l 254250
12.3%
s 254250
12.3%
t 200955
9.7%
r 200955
9.7%
u 200955
9.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2075070
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 455205
21.9%
f 254250
12.3%
a 254250
12.3%
l 254250
12.3%
s 254250
12.3%
t 200955
9.7%
r 200955
9.7%
u 200955
9.7%
Distinct: 2
Distinct (%): < 0.1%
Missing: 7
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 5
Median length: 5
Mean length: 4.986522556
Min length: 4

Characters and Unicode

Total characters: 2269890
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false

Value  Count   Frequency (%)
false  449070  98.7%
true   6135    1.3%

Most occurring characters

ValueCountFrequency (%)
e 455205
20.1%
f 449070
19.8%
a 449070
19.8%
l 449070
19.8%
s 449070
19.8%
t 6135
 
0.3%
r 6135
 
0.3%
u 6135
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2269890
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 455205
20.1%
f 449070
19.8%
a 449070
19.8%
l 449070
19.8%
s 449070
19.8%
t 6135
 
0.3%
r 6135
 
0.3%
u 6135
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 2269890
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 455205
20.1%
f 449070
19.8%
a 449070
19.8%
l 449070
19.8%
s 449070
19.8%
t 6135
 
0.3%
r 6135
 
0.3%
u 6135
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2269890
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 455205
20.1%
f 449070
19.8%
a 449070
19.8%
l 449070
19.8%
s 449070
19.8%
t 6135
 
0.3%
r 6135
 
0.3%
u 6135
 
0.3%
Distinct: 28364
Distinct (%): 6.2%
Missing: 7
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.851581156
Min length: 1

Characters and Unicode

Total characters: 3118874
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 8055
Unique (%): 1.8%

Sample

1st row: 5213106
2nd row: 7822511
3rd row: 5209002
4th row: 2359811
5th row: 2369651

Value                 Count   Frequency (%)
4274                  1630    0.4%
2376138               1001    0.2%
2359014               971     0.2%
2360481               895     0.2%
2367736               889     0.2%
2361357               853     0.2%
2359823               815     0.2%
2358931               758     0.2%
2365441               757     0.2%
4253                  730     0.2%
Other values (28354)  445906  98.0%

Most occurring characters

ValueCountFrequency (%)
2 594777
19.1%
3 469007
15.0%
4 302909
9.7%
5 292779
9.4%
8 260314
8.3%
0 254319
8.2%
9 250457
8.0%
1 243983
7.8%
6 227063
 
7.3%
7 223266
 
7.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3118874
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 594777
19.1%
3 469007
15.0%
4 302909
9.7%
5 292779
9.4%
8 260314
8.3%
0 254319
8.2%
9 250457
8.0%
1 243983
7.8%
6 227063
 
7.3%
7 223266
 
7.2%

Most occurring scripts

ValueCountFrequency (%)
Common 3118874
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 594777
19.1%
3 469007
15.0%
4 302909
9.7%
5 292779
9.4%
8 260314
8.3%
0 254319
8.2%
9 250457
8.0%
1 243983
7.8%
6 227063
 
7.3%
7 223266
 
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3118874
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 594777
19.1%
3 469007
15.0%
4 302909
9.7%
5 292779
9.4%
8 260314
8.3%
0 254319
8.2%
9 250457
8.0%
1 243983
7.8%
6 227063
 
7.3%
7 223266
 
7.2%
Distinct: 22054
Distinct (%): 4.8%
Missing: 211
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.847697038
Min length: 2

Characters and Unicode

Total characters: 3115709
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 4768
Unique (%): 1.0%

Sample

1st row: 5213106
2nd row: 7822511
3rd row: 5209001
4th row: 2359811
5th row: 2369651

Value                 Count   Frequency (%)
4274                  1630    0.4%
2360481               1121    0.2%
2359014               1113    0.2%
2359823               1006    0.2%
2376138               1001    0.2%
2366967               904     0.2%
2367736               893     0.2%
2394503               857     0.2%
2361357               853     0.2%
2358931               760     0.2%
Other values (22044)  444863  97.8%

Most occurring characters

ValueCountFrequency (%)
2 599111
19.2%
3 470277
15.1%
4 304204
9.8%
5 294417
9.4%
8 259388
8.3%
0 253770
8.1%
9 251072
8.1%
1 239346
 
7.7%
7 223101
 
7.2%
6 221023
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3115709
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 599111
19.2%
3 470277
15.1%
4 304204
9.8%
5 294417
9.4%
8 259388
8.3%
0 253770
8.1%
9 251072
8.1%
1 239346
 
7.7%
7 223101
 
7.2%
6 221023
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 3115709
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 599111
19.2%
3 470277
15.1%
4 304204
9.8%
5 294417
9.4%
8 259388
8.3%
0 253770
8.1%
9 251072
8.1%
1 239346
 
7.7%
7 223101
 
7.2%
6 221023
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3115709
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 599111
19.2%
3 470277
15.1%
4 304204
9.8%
5 294417
9.4%
8 259388
8.3%
0 253770
8.1%
9 251072
8.1%
1 239346
 
7.7%
7 223101
 
7.2%
6 221023
 
7.1%
Distinct: 2
Distinct (%): < 0.1%
Missing: 7
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 455205
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Value  Count   Frequency (%)
1      454998  > 99.9%
0      207     < 0.1%

Most occurring characters

ValueCountFrequency (%)
1 454998
> 99.9%
0 207
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 455205
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 454998
> 99.9%
0 207
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 455205
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 454998
> 99.9%
0 207
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 455205
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 454998
> 99.9%
0 207
 
< 0.1%
Distinct: 2
Distinct (%): < 0.1%
Missing: 292
Missing (%): 0.1%
Memory size: 3.5 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 909840
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 44
2nd row: 44
3rd row: 44
4th row: 44
5th row: 44

Value  Count   Frequency (%)
44     454913  > 99.9%
54     7       < 0.1%

Most occurring characters

ValueCountFrequency (%)
4 909833
> 99.9%
5 7
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 909840
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 909833
> 99.9%
5 7
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 909840
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 909833
> 99.9%
5 7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 909840
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 909833
> 99.9%
5 7
 
< 0.1%

classKey
Text

Missing 

Distinct: 9
Distinct (%): 0.1%
Missing: 444746
Missing (%): 97.7%
Memory size: 3.5 MiB

Length

Max length: 8
Median length: 3
Mean length: 3.486432257
Min length: 3

Characters and Unicode

Total characters: 36489
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 121
2nd row: 11881065
3rd row: 11881065
4th row: 121
5th row: 121

Value     Count  Frequency (%)
121       8825   84.3%
11881065  565    5.4%
7375758   514    4.9%
120       362    3.5%
119       150    1.4%
11500725  28     0.3%
11733052  14     0.1%
367       7      0.1%
131       1      < 0.1%

Most occurring characters

ValueCountFrequency (%)
1 20093
55.1%
2 9229
25.3%
5 1663
 
4.6%
8 1644
 
4.5%
7 1591
 
4.4%
0 997
 
2.7%
6 572
 
1.6%
3 550
 
1.5%
9 150
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 36489
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 20093
55.1%
2 9229
25.3%
5 1663
 
4.6%
8 1644
 
4.5%
7 1591
 
4.4%
0 997
 
2.7%
6 572
 
1.6%
3 550
 
1.5%
9 150
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Common 36489
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 20093
55.1%
2 9229
25.3%
5 1663
 
4.6%
8 1644
 
4.5%
7 1591
 
4.4%
0 997
 
2.7%
6 572
 
1.6%
3 550
 
1.5%
9 150
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36489
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 20093
55.1%
2 9229
25.3%
5 1663
 
4.6%
8 1644
 
4.5%
7 1591
 
4.4%
0 997
 
2.7%
6 572
 
1.6%
3 550
 
1.5%
9 150
 
0.4%
Distinct: 64
Distinct (%): < 0.1%
Missing: 1008
Missing (%): 0.2%
Memory size: 3.5 MiB

Length

Max length: 7
Median length: 3
Mean length: 3.16821076
Min length: 3

Characters and Unicode

Total characters: 1439014
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 495
2nd row: 1067
3rd row: 587
4th row: 1153
5th row: 587

Value              Count   Frequency (%)
587                212582  46.8%
1153               33752   7.4%
590                17672   3.9%
537                17478   3.8%
495                17113   3.8%
708                14280   3.1%
1306               13708   3.0%
588                12320   2.7%
774                12085   2.7%
772                10526   2.3%
Other values (54)  92688   20.4%

Most occurring characters

ValueCountFrequency (%)
5 341848
23.8%
7 323015
22.4%
8 297305
20.7%
1 117756
 
8.2%
3 98457
 
6.8%
9 78657
 
5.5%
4 74471
 
5.2%
0 65102
 
4.5%
6 28721
 
2.0%
2 13682
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1439014
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 341848
23.8%
7 323015
22.4%
8 297305
20.7%
1 117756
 
8.2%
3 98457
 
6.8%
9 78657
 
5.5%
4 74471
 
5.2%
0 65102
 
4.5%
6 28721
 
2.0%
2 13682
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1439014
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 341848
23.8%
7 323015
22.4%
8 297305
20.7%
1 117756
 
8.2%
3 98457
 
6.8%
9 78657
 
5.5%
4 74471
 
5.2%
0 65102
 
4.5%
6 28721
 
2.0%
2 13682
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1439014
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 341848
23.8%
7 323015
22.4%
8 297305
20.7%
1 117756
 
8.2%
3 98457
 
6.8%
9 78657
 
5.5%
4 74471
 
5.2%
0 65102
 
4.5%
6 28721
 
2.0%
2 13682
 
1.0%
Distinct: 554
Distinct (%): 0.1%
Missing: 840
Missing (%): 0.2%
Memory size: 3.5 MiB

Length

Max length: 8
Median length: 4
Mean length: 4.022034808
Min length: 4

Characters and Unicode

Total characters: 1827500
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 16
Unique (%): < 0.1%

Sample

1st row: 2953
2nd row: 8473
3rd row: 4274
4th row: 7336
5th row: 4256

Value               Count   Frequency (%)
7336                27640   6.1%
4274                26017   5.7%
4499                16208   3.6%
8535                14638   3.2%
4251                14508   3.2%
4217                13553   3.0%
4236                12381   2.7%
8597                11376   2.5%
7201                9124    2.0%
2225                7881    1.7%
Other values (544)  301046  66.3%

Most occurring characters

ValueCountFrequency (%)
2 265025
14.5%
4 244389
13.4%
5 233139
12.8%
7 194357
10.6%
8 187507
10.3%
3 177749
9.7%
1 146364
8.0%
6 145909
8.0%
9 143794
7.9%
0 89267
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1827500
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 265025
14.5%
4 244389
13.4%
5 233139
12.8%
7 194357
10.6%
8 187507
10.3%
3 177749
9.7%
1 146364
8.0%
6 145909
8.0%
9 143794
7.9%
0 89267
 
4.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1827500
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 265025
14.5%
4 244389
13.4%
5 233139
12.8%
7 194357
10.6%
8 187507
10.3%
3 177749
9.7%
1 146364
8.0%
6 145909
8.0%
9 143794
7.9%
0 89267
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1827500
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 265025
14.5%
4 244389
13.4%
5 233139
12.8%
7 194357
10.6%
8 187507
10.3%
3 177749
9.7%
1 146364
8.0%
6 145909
8.0%
9 143794
7.9%
0 89267
 
4.9%

genusKey
Text

Missing 

Distinct4426
Distinct (%)1.0%
Missing23593
Missing (%)5.2%
Memory size3.5 MiB

Length

Max length8
Median length7
Mean length7.003530892
Min length7

Characters and Unicode

Total characters3022857
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
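The length and character figures reported for genusKey can be cross-checked with simple arithmetic. The sketch below assumes that the mean length is computed over non-missing values only; the observation count (455212) comes from the report overview, and the variable names are illustrative.

```python
# Figures copied from the report; names are illustrative.
n_observations = 455212      # "Number of observations" in the overview
n_missing = 23593            # genusKey "Missing"
total_characters = 3022857   # genusKey "Total characters"

# Mean length over the values that are actually present.
n_present = n_observations - n_missing
mean_length = total_characters / n_present

# Lengths range from 7 to 8, so every character beyond 7 per value
# must come from an 8-character key.
n_eight_char_keys = total_characters - 7 * n_present

print(n_present, round(mean_length, 9), n_eight_char_keys)
```

Under that assumption the result agrees with the reported mean length of 7.003530892, and it implies that 1524 of the 431619 present keys are 8 characters long.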

Unique

Unique409 ?
Unique (%)0.1%

Sample

1st row2404224
2nd row7822511
3rd row2378400
4th row2359788
5th row2356959
Value                Count    Frequency (%)
2382199              5026     1.2%
2403463              4350     1.0%
2394482              4347     1.0%
2362128              4334     1.0%
2369550              4239     1.0%
2356953              3825     0.9%
2381823              3118     0.7%
5962165              3031     0.7%
2379647              2923     0.7%
2380069              2919     0.7%
Other values (4416)  393507   91.2%

Most occurring characters

ValueCountFrequency (%)
2 592252
19.6%
3 524642
17.4%
4 287118
9.5%
9 256376
8.5%
6 254775
8.4%
5 239254
7.9%
8 231988
 
7.7%
0 221607
 
7.3%
7 209793
 
6.9%
1 205052
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3022857
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 592252
19.6%
3 524642
17.4%
4 287118
9.5%
9 256376
8.5%
6 254775
8.4%
5 239254
7.9%
8 231988
 
7.7%
0 221607
 
7.3%
7 209793
 
6.9%
1 205052
 
6.8%

Most occurring scripts

ValueCountFrequency (%)
Common 3022857
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 592252
19.6%
3 524642
17.4%
4 287118
9.5%
9 256376
8.5%
6 254775
8.4%
5 239254
7.9%
8 231988
 
7.7%
0 221607
 
7.3%
7 209793
 
6.9%
1 205052
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3022857
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 592252
19.6%
3 524642
17.4%
4 287118
9.5%
9 256376
8.5%
6 254775
8.4%
5 239254
7.9%
8 231988
 
7.7%
0 221607
 
7.3%
7 209793
 
6.9%
1 205052
 
6.8%

speciesKey
Text

Missing 

Distinct19431
Distinct (%)5.0%
Missing70260
Missing (%)15.4%
Memory size3.5 MiB

Length

Max length8
Median length7
Mean length7.002387311
Min length7

Characters and Unicode

Total characters2695583
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4217 ?
Unique (%)1.1%

Sample

1st row5213106
2nd row5209001
3rd row2359811
4th row2369651
5th row2403057
Value                 Count    Frequency (%)
2360481               1121     0.3%
2359014               1113     0.3%
2359823               1006     0.3%
2361357               943      0.2%
2365439               938      0.2%
2366967               904      0.2%
2367736               893      0.2%
2394503               857      0.2%
2365441               760      0.2%
2358931               760      0.2%
Other values (19421)  375657   97.6%

Most occurring characters

ValueCountFrequency (%)
2 523356
19.4%
3 406179
15.1%
4 259302
9.6%
5 256480
9.5%
0 226170
8.4%
8 225603
8.4%
9 216295
8.0%
1 207921
 
7.7%
6 188342
 
7.0%
7 185935
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2695583
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 523356
19.4%
3 406179
15.1%
4 259302
9.6%
5 256480
9.5%
0 226170
8.4%
8 225603
8.4%
9 216295
8.0%
1 207921
 
7.7%
6 188342
 
7.0%
7 185935
 
6.9%

Most occurring scripts

ValueCountFrequency (%)
Common 2695583
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 523356
19.4%
3 406179
15.1%
4 259302
9.6%
5 256480
9.5%
0 226170
8.4%
8 225603
8.4%
9 216295
8.0%
1 207921
 
7.7%
6 188342
 
7.0%
7 185935
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2695583
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 523356
19.4%
3 406179
15.1%
4 259302
9.6%
5 256480
9.5%
0 226170
8.4%
8 225603
8.4%
9 216295
8.0%
1 207921
 
7.7%
6 188342
 
7.0%
7 185935
 
6.9%

species
Text

Missing 

Distinct19429
Distinct (%)5.0%
Missing70260
Missing (%)15.4%
Memory size3.5 MiB

Length

Max length35
Median length31
Mean length19.79614082
Min length8

Characters and Unicode

Total characters7620564
Distinct characters54
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4217 ?
Unique (%)1.1%

Sample

1st rowEchidna nebulosa
2nd rowMyersina filifer
3rd rowRhinichthys cataractae
4th rowCentropomus ensiferus
5th rowGorgasia inferomaculata
Value                 Count    Frequency (%)
etheostoma            4924     0.6%
chaetodon             4170     0.5%
notropis              4110     0.5%
lepomis               4038     0.5%
gymnothorax           4025     0.5%
lutjanus              3782     0.5%
chromis               2870     0.4%
halichoeres           2861     0.4%
synodus               2680     0.3%
acanthurus            2550     0.3%
Other values (15283)  733894   95.3%

Most occurring characters

ValueCountFrequency (%)
s 711716
 
9.3%
a 681062
 
8.9%
i 624744
 
8.2%
o 551041
 
7.2%
u 497264
 
6.5%
e 491012
 
6.4%
r 459068
 
6.0%
t 435608
 
5.7%
n 408570
 
5.4%
384952
 
5.1%
Other values (44) 2375527
31.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6850578
89.9%
Space Separator 384952
 
5.1%
Uppercase Letter 384952
 
5.1%
Dash Punctuation 82
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 711716
10.4%
a 681062
9.9%
i 624744
 
9.1%
o 551041
 
8.0%
u 497264
 
7.3%
e 491012
 
7.2%
r 459068
 
6.7%
t 435608
 
6.4%
n 408570
 
6.0%
l 337102
 
4.9%
Other values (16) 1653391
24.1%
Uppercase Letter
ValueCountFrequency (%)
C 55562
14.4%
P 46179
12.0%
S 43666
11.3%
A 35670
9.3%
E 25750
 
6.7%
L 24627
 
6.4%
H 21929
 
5.7%
M 21577
 
5.6%
N 17672
 
4.6%
G 15987
 
4.2%
Other values (16) 76333
19.8%
Space Separator
ValueCountFrequency (%)
384952
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 82
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7235530
94.9%
Common 385034
 
5.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 711716
 
9.8%
a 681062
 
9.4%
i 624744
 
8.6%
o 551041
 
7.6%
u 497264
 
6.9%
e 491012
 
6.8%
r 459068
 
6.3%
t 435608
 
6.0%
n 408570
 
5.6%
l 337102
 
4.7%
Other values (42) 2038343
28.2%
Common
ValueCountFrequency (%)
384952
> 99.9%
- 82
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7620564
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 711716
 
9.3%
a 681062
 
8.9%
i 624744
 
8.2%
o 551041
 
7.2%
u 497264
 
6.5%
e 491012
 
6.4%
r 459068
 
6.0%
t 435608
 
5.7%
n 408570
 
5.4%
384952
 
5.1%
Other values (44) 2375527
31.2%
Distinct22054
Distinct (%)4.8%
Missing211
Missing (%)< 0.1%
Memory size3.5 MiB

Length

Max length111
Median length88
Mean length34.4021156
Min length7

Characters and Unicode

Total characters15652997
Distinct characters90
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4768 ?
Unique (%)1.0%

Sample

1st rowEchidna nebulosa (Ahl, 1789)
2nd rowMugil Linnaeus, 1758
3rd rowMyersina filifer (Valenciennes, 1837)
4th rowRhinichthys cataractae (Valenciennes, 1842)
5th rowCentropomus ensiferus Poey, 1860
ValueCountFrequency (%)
73674
 
4.0%
linnaeus 27502
 
1.5%
bleeker 24276
 
1.3%
1758 21716
 
1.2%
valenciennes 20911
 
1.1%
cuvier 18805
 
1.0%
bloch 16076
 
0.9%
jordan 16047
 
0.9%
lacepède 14474
 
0.8%
günther 13679
 
0.7%
Other values (18045) 1608922
86.7%

Most occurring characters

ValueCountFrequency (%)
1401081
 
9.0%
e 1049108
 
6.7%
a 1030592
 
6.6%
i 924453
 
5.9%
s 912612
 
5.8%
n 776988
 
5.0%
r 763435
 
4.9%
o 760269
 
4.9%
u 639070
 
4.1%
l 589065
 
3.8%
Other values (80) 6806324
43.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10559704
67.5%
Decimal Number 1720212
 
11.0%
Space Separator 1401081
 
9.0%
Uppercase Letter 968743
 
6.2%
Other Punctuation 507195
 
3.2%
Open Punctuation 246790
 
1.6%
Close Punctuation 246790
 
1.6%
Dash Punctuation 2482
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1049108
9.9%
a 1030592
9.8%
i 924453
 
8.8%
s 912612
 
8.6%
n 776988
 
7.4%
r 763435
 
7.2%
o 760269
 
7.2%
u 639070
 
6.1%
l 589065
 
5.6%
t 574724
 
5.4%
Other values (34) 2539388
24.0%
Uppercase Letter
ValueCountFrequency (%)
C 101740
10.5%
S 98482
10.2%
B 89774
 
9.3%
L 87106
 
9.0%
G 84761
 
8.7%
P 66896
 
6.9%
A 53025
 
5.5%
R 50566
 
5.2%
M 48486
 
5.0%
E 42815
 
4.4%
Other values (18) 245092
25.3%
Decimal Number
ValueCountFrequency (%)
1 503738
29.3%
8 362257
21.1%
9 175589
 
10.2%
7 134109
 
7.8%
5 112293
 
6.5%
0 107320
 
6.2%
2 89202
 
5.2%
6 84607
 
4.9%
3 84274
 
4.9%
4 66823
 
3.9%
Other Punctuation
ValueCountFrequency (%)
, 433183
85.4%
& 73674
 
14.5%
. 263
 
0.1%
' 75
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1401081
100.0%
Open Punctuation
ValueCountFrequency (%)
( 246790
100.0%
Close Punctuation
ValueCountFrequency (%)
) 246790
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2482
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11528447
73.7%
Common 4124550
 
26.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1049108
 
9.1%
a 1030592
 
8.9%
i 924453
 
8.0%
s 912612
 
7.9%
n 776988
 
6.7%
r 763435
 
6.6%
o 760269
 
6.6%
u 639070
 
5.5%
l 589065
 
5.1%
t 574724
 
5.0%
Other values (62) 3508131
30.4%
Common
ValueCountFrequency (%)
1401081
34.0%
1 503738
 
12.2%
, 433183
 
10.5%
8 362257
 
8.8%
( 246790
 
6.0%
) 246790
 
6.0%
9 175589
 
4.3%
7 134109
 
3.3%
5 112293
 
2.7%
0 107320
 
2.6%
Other values (8) 401400
 
9.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15595063
99.6%
None 57934
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1401081
 
9.0%
e 1049108
 
6.7%
a 1030592
 
6.6%
i 924453
 
5.9%
s 912612
 
5.9%
n 776988
 
5.0%
r 763435
 
4.9%
o 760269
 
4.9%
u 639070
 
4.1%
l 589065
 
3.8%
Other values (60) 6748390
43.3%
None
ValueCountFrequency (%)
ü 25216
43.5%
è 14495
25.0%
å 11815
20.4%
ö 2996
 
5.2%
é 2103
 
3.6%
ø 575
 
1.0%
á 277
 
0.5%
ó 158
 
0.3%
ă 111
 
0.2%
ç 56
 
0.1%
Other values (10) 132
 
0.2%
Distinct30202
Distinct (%)6.6%
Missing14
Missing (%)< 0.1%
Memory size3.5 MiB

Length

Max length69
Median length54
Mean length18.57410402
Min length2

Characters and Unicode

Total characters8454895
Distinct characters70
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique9327 ?
Unique (%)2.0%

Sample

1st rowEchidna nebulosa
2nd rowMugil
3rd rowCryptocentrus filifer
4th rowRhinichthys cataractae
5th rowCentropomus ensiferus
Value                 Count    Frequency (%)
notropis              7207     0.8%
etheostoma            4890     0.6%
chaetodon             4339     0.5%
gymnothorax           4324     0.5%
lepomis               4273     0.5%
lutjanus              3888     0.5%
chromis               3151     0.4%
halichoeres           3126     0.4%
pomacentrus           2957     0.3%
acanthurus            2893     0.3%
Other values (18882)  813929   95.2%

Most occurring characters

ValueCountFrequency (%)
s 773325
 
9.1%
a 769363
 
9.1%
i 700851
 
8.3%
o 618984
 
7.3%
e 559438
 
6.6%
u 535543
 
6.3%
r 508516
 
6.0%
t 479216
 
5.7%
n 446288
 
5.3%
399779
 
4.7%
Other values (60) 2663592
31.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7598796
89.9%
Uppercase Letter 455383
 
5.4%
Space Separator 399779
 
4.7%
Close Punctuation 292
 
< 0.1%
Open Punctuation 292
 
< 0.1%
Other Punctuation 152
 
< 0.1%
Dash Punctuation 111
 
< 0.1%
Decimal Number 90
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 773325
10.2%
a 769363
10.1%
i 700851
 
9.2%
o 618984
 
8.1%
e 559438
 
7.4%
u 535543
 
7.0%
r 508516
 
6.7%
t 479216
 
6.3%
n 446288
 
5.9%
l 365536
 
4.8%
Other values (16) 1841736
24.2%
Uppercase Letter
ValueCountFrequency (%)
C 63044
13.8%
P 53484
11.7%
S 52488
11.5%
A 43659
9.6%
E 29489
 
6.5%
L 27979
 
6.1%
M 27366
 
6.0%
H 25637
 
5.6%
G 21345
 
4.7%
N 21144
 
4.6%
Other values (16) 89748
19.7%
Decimal Number
ValueCountFrequency (%)
1 39
43.3%
2 25
27.8%
3 15
 
16.7%
4 3
 
3.3%
6 3
 
3.3%
9 2
 
2.2%
5 2
 
2.2%
7 1
 
1.1%
Other Punctuation
ValueCountFrequency (%)
. 84
55.3%
/ 41
27.0%
? 11
 
7.2%
& 8
 
5.3%
5
 
3.3%
# 3
 
2.0%
Space Separator
ValueCountFrequency (%)
399779
100.0%
Close Punctuation
ValueCountFrequency (%)
) 292
100.0%
Open Punctuation
ValueCountFrequency (%)
( 292
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 111
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8054179
95.3%
Common 400716
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 773325
 
9.6%
a 769363
 
9.6%
i 700851
 
8.7%
o 618984
 
7.7%
e 559438
 
6.9%
u 535543
 
6.6%
r 508516
 
6.3%
t 479216
 
5.9%
n 446288
 
5.5%
l 365536
 
4.5%
Other values (42) 2297119
28.5%
Common
ValueCountFrequency (%)
399779
99.8%
) 292
 
0.1%
( 292
 
0.1%
- 111
 
< 0.1%
. 84
 
< 0.1%
/ 41
 
< 0.1%
1 39
 
< 0.1%
2 25
 
< 0.1%
3 15
 
< 0.1%
? 11
 
< 0.1%
Other values (8) 27
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8454890
> 99.9%
Punctuation 5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 773325
 
9.1%
a 769363
 
9.1%
i 700851
 
8.3%
o 618984
 
7.3%
e 559438
 
6.6%
u 535543
 
6.3%
r 508516
 
6.0%
t 479216
 
5.7%
n 446288
 
5.3%
399779
 
4.7%
Other values (59) 2663587
31.5%
Punctuation
ValueCountFrequency (%)
5
100.0%

protocol
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing7
Missing (%)< 0.1%
Memory size3.5 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1365615
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEML
2nd rowEML
3rd rowEML
4th rowEML
5th rowEML
Value  Count    Frequency (%)
eml    455205   100.0%

Most occurring characters

ValueCountFrequency (%)
E 455205
33.3%
M 455205
33.3%
L 455205
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1365615
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 455205
33.3%
M 455205
33.3%
L 455205
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1365615
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 455205
33.3%
M 455205
33.3%
L 455205
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1365615
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 455205
33.3%
M 455205
33.3%
L 455205
33.3%
Distinct173323
Distinct (%)38.1%
Missing7
Missing (%)< 0.1%
Memory size3.5 MiB

Length

Max length24
Median length24
Mean length23.99588757
Min length20

Characters and Unicode

Total characters10923048
Distinct characters15
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
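The values of this variable look like ISO 8601 UTC timestamps of the form 2024-12-02T13:56:09.099Z (24 characters). The reported minimum length of 20, and the fact that "." occurs 454737 times while "T" and "Z" each occur 455205 times, suggest that 468 values omit the four-character millisecond part. The sketch below checks that reading against the reported total and shows one way to parse both forms with the standard library. The function name `parse_utc_timestamp` is a hypothetical helper, and the shorter format is an assumption inferred from the counts rather than something shown in the samples.

```python
from datetime import datetime, timezone

# Figures copied from the report for this variable.
n_present = 455205          # occurrences of "T" (one per value)
n_with_fraction = 454737    # occurrences of "."
total_characters = 10923048

# Values without ".fff" would be 4 characters shorter (20 instead of 24).
n_without_fraction = n_present - n_with_fraction
expected_total = 24 * n_present - 4 * n_without_fraction


def parse_utc_timestamp(text):
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fff]Z' into an aware UTC datetime.

    Hypothetical helper: tries the millisecond form first, then the
    assumed shorter form without fractional seconds.
    """
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {text!r}")


with_ms = parse_utc_timestamp("2024-12-02T13:56:09.099Z")
without_ms = parse_utc_timestamp("2024-12-02T13:56:09Z")
```

The expected total computed this way equals the reported total of 10923048 characters, which is consistent with exactly 468 values of length 20.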

Unique

Unique50645 ?
Unique (%)11.1%

Sample

1st row2024-12-02T13:56:09.099Z
2nd row2024-12-02T13:56:08.596Z
3rd row2024-12-02T13:59:51.375Z
4th row2024-12-02T13:58:24.571Z
5th row2024-12-02T13:56:08.212Z
Value                     Count    Frequency (%)
2024-12-02t13:57:53.333z  14       < 0.1%
2024-12-02t13:57:01.873z  14       < 0.1%
2024-12-02t13:57:03.178z  13       < 0.1%
2024-12-02t13:57:41.128z  13       < 0.1%
2024-12-02t13:57:28.109z  13       < 0.1%
2024-12-02t13:57:52.916z  13       < 0.1%
2024-12-02t13:57:04.016z  13       < 0.1%
2024-12-02t13:57:30.416z  13       < 0.1%
2024-12-02t13:58:01.465z  13       < 0.1%
2024-12-02t13:57:21.641z  12       < 0.1%
Other values (173313)     455074   > 99.9%

Most occurring characters

ValueCountFrequency (%)
2 2078433
19.0%
0 1154714
10.6%
1 1148157
10.5%
- 910410
8.3%
: 910410
8.3%
4 731701
 
6.7%
5 722652
 
6.6%
3 721318
 
6.6%
T 455205
 
4.2%
Z 455205
 
4.2%
Other values (5) 1634843
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7737081
70.8%
Other Punctuation 1365147
 
12.5%
Dash Punctuation 910410
 
8.3%
Uppercase Letter 910410
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2078433
26.9%
0 1154714
14.9%
1 1148157
14.8%
4 731701
 
9.5%
5 722652
 
9.3%
3 721318
 
9.3%
7 349813
 
4.5%
9 291367
 
3.8%
6 275056
 
3.6%
8 263870
 
3.4%
Other Punctuation
ValueCountFrequency (%)
: 910410
66.7%
. 454737
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 455205
50.0%
Z 455205
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 910410
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10012638
91.7%
Latin 910410
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2078433
20.8%
0 1154714
11.5%
1 1148157
11.5%
- 910410
9.1%
: 910410
9.1%
4 731701
 
7.3%
5 722652
 
7.2%
3 721318
 
7.2%
. 454737
 
4.5%
7 349813
 
3.5%
Other values (3) 830293
 
8.3%
Latin
ValueCountFrequency (%)
T 455205
50.0%
Z 455205
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10923048
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2078433
19.0%
0 1154714
10.6%
1 1148157
10.5%
- 910410
8.3%
: 910410
8.3%
4 731701
 
6.7%
5 722652
 
6.6%
3 721318
 
6.6%
T 455205
 
4.2%
Z 455205
 
4.2%
Other values (5) 1634843
15.0%

lastCrawled
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing7
Missing (%)< 0.1%
Memory size3.5 MiB

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters10924920
Distinct characters12
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2024-12-02T11:48:23.416Z
2nd row2024-12-02T11:48:23.416Z
3rd row2024-12-02T11:48:23.416Z
4th row2024-12-02T11:48:23.416Z
5th row2024-12-02T11:48:23.416Z
Value                     Count    Frequency (%)
2024-12-02t11:48:23.416z  455205   100.0%

Most occurring characters

ValueCountFrequency (%)
2 2276025
20.8%
1 1820820
16.7%
4 1365615
12.5%
0 910410
 
8.3%
- 910410
 
8.3%
: 910410
 
8.3%
T 455205
 
4.2%
8 455205
 
4.2%
3 455205
 
4.2%
. 455205
 
4.2%
Other values (2) 910410
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7738485
70.8%
Other Punctuation 1365615
 
12.5%
Dash Punctuation 910410
 
8.3%
Uppercase Letter 910410
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2276025
29.4%
1 1820820
23.5%
4 1365615
17.6%
0 910410
 
11.8%
8 455205
 
5.9%
3 455205
 
5.9%
6 455205
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 910410
66.7%
. 455205
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 455205
50.0%
Z 455205
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 910410
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10014510
91.7%
Latin 910410
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2276025
22.7%
1 1820820
18.2%
4 1365615
13.6%
0 910410
 
9.1%
- 910410
 
9.1%
: 910410
 
9.1%
8 455205
 
4.5%
3 455205
 
4.5%
. 455205
 
4.5%
6 455205
 
4.5%
Latin
ValueCountFrequency (%)
T 455205
50.0%
Z 455205
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10924920
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2276025
20.8%
1 1820820
16.7%
4 1365615
12.5%
0 910410
 
8.3%
- 910410
 
8.3%
: 910410
 
8.3%
T 455205
 
4.2%
8 455205
 
4.2%
3 455205
 
4.2%
. 455205
 
4.2%
Other values (2) 910410
 
8.3%

repatriated
Text

Missing 

Distinct2
Distinct (%)< 0.1%
Missing30397
Missing (%)6.7%
Memory size3.5 MiB

Length

Max length5
Median length4
Mean length4.293219401
Min length4

Characters and Unicode

Total characters1823824
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfalse
2nd rowfalse
3rd rowtrue
4th rowfalse
5th rowtrue
Value  Count    Frequency (%)
true   300251   70.7%
false  124564   29.3%

Most occurring characters

ValueCountFrequency (%)
e 424815
23.3%
t 300251
16.5%
r 300251
16.5%
u 300251
16.5%
f 124564
 
6.8%
a 124564
 
6.8%
l 124564
 
6.8%
s 124564
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1823824
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 424815
23.3%
t 300251
16.5%
r 300251
16.5%
u 300251
16.5%
f 124564
 
6.8%
a 124564
 
6.8%
l 124564
 
6.8%
s 124564
 
6.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 1823824
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 424815
23.3%
t 300251
16.5%
r 300251
16.5%
u 300251
16.5%
f 124564
 
6.8%
a 124564
 
6.8%
l 124564
 
6.8%
s 124564
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1823824
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 424815
23.3%
t 300251
16.5%
r 300251
16.5%
u 300251
16.5%
f 124564
 
6.8%
a 124564
 
6.8%
l 124564
 
6.8%
s 124564
 
6.8%
Distinct2
Distinct (%)< 0.1%
Missing7
Missing (%)< 0.1%
Memory size3.5 MiB

Length

Max length5
Median length5
Mean length4.999011434
Min length4

Characters and Unicode

Total characters2275575
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfalse
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
Value  Count    Frequency (%)
false  454755   99.9%
true   450      0.1%

Most occurring characters

ValueCountFrequency (%)
e 455205
20.0%
f 454755
20.0%
a 454755
20.0%
l 454755
20.0%
s 454755
20.0%
t 450
 
< 0.1%
r 450
 
< 0.1%
u 450
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2275575
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 455205
20.0%
f 454755
20.0%
a 454755
20.0%
l 454755
20.0%
s 454755
20.0%
t 450
 
< 0.1%
r 450
 
< 0.1%
u 450
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 2275575
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 455205
20.0%
f 454755
20.0%
a 454755
20.0%
l 454755
20.0%
s 454755
20.0%
t 450
 
< 0.1%
r 450
 
< 0.1%
u 450
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2275575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 455205
20.0%
f 454755
20.0%
a 454755
20.0%
l 454755
20.0%
s 454755
20.0%
t 450
 
< 0.1%
r 450
 
< 0.1%
u 450
 
< 0.1%

gbifRegion
Text

Missing 

Distinct7
Distinct (%)< 0.1%
Missing32195
Missing (%)7.1%
Memory size3.5 MiB

Length

Max length13
Median length13
Mean length9.506102592
Min length4

Characters and Unicode

Total characters4021243
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNORTH_AMERICA
2nd rowNORTH_AMERICA
3rd rowASIA
4th rowNORTH_AMERICA
5th rowLATIN_AMERICA
Value          Count    Frequency (%)
north_america  127654   30.2%
latin_america  100745   23.8%
asia           94416    22.3%
oceania        68048    16.1%
africa         24873    5.9%
europe         5998     1.4%
antarctica     1283     0.3%

Most occurring characters

ValueCountFrequency (%)
A 936066
23.3%
I 517764
12.9%
R 388207
9.7%
C 323886
 
8.1%
E 308443
 
7.7%
N 297730
 
7.4%
T 230965
 
5.7%
_ 228399
 
5.7%
M 228399
 
5.7%
O 201700
 
5.0%
Other values (6) 359684
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3792844
94.3%
Connector Punctuation 228399
 
5.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 936066
24.7%
I 517764
13.7%
R 388207
10.2%
C 323886
 
8.5%
E 308443
 
8.1%
N 297730
 
7.8%
T 230965
 
6.1%
M 228399
 
6.0%
O 201700
 
5.3%
H 127654
 
3.4%
Other values (5) 232030
 
6.1%
Connector Punctuation
ValueCountFrequency (%)
_ 228399
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3792844
94.3%
Common 228399
 
5.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 936066
24.7%
I 517764
13.7%
R 388207
10.2%
C 323886
 
8.5%
E 308443
 
8.1%
N 297730
 
7.8%
T 230965
 
6.1%
M 228399
 
6.0%
O 201700
 
5.3%
H 127654
 
3.4%
Other values (5) 232030
 
6.1%
Common
ValueCountFrequency (%)
_ 228399
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4021243
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 936066
23.3%
I 517764
12.9%
R 388207
9.7%
C 323886
 
8.1%
E 308443
 
7.7%
N 297730
 
7.4%
T 230965
 
5.7%
_ 228399
 
5.7%
M 228399
 
5.7%
O 201700
 
5.0%
Other values (6) 359684
 
8.9%

publishedByGbifRegion
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing7
Missing (%)< 0.1%
Memory size3.5 MiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters5917665
Distinct characters11
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNORTH_AMERICA
2nd rowNORTH_AMERICA
3rd rowNORTH_AMERICA
4th rowNORTH_AMERICA
5th rowNORTH_AMERICA
Value          Count    Frequency (%)
north_america  455205   100.0%

Most occurring characters

ValueCountFrequency (%)
R 910410
15.4%
A 910410
15.4%
N 455205
7.7%
O 455205
7.7%
T 455205
7.7%
H 455205
7.7%
_ 455205
7.7%
M 455205
7.7%
E 455205
7.7%
I 455205
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5462460
92.3%
Connector Punctuation 455205
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 910410
16.7%
A 910410
16.7%
N 455205
8.3%
O 455205
8.3%
T 455205
8.3%
H 455205
8.3%
M 455205
8.3%
E 455205
8.3%
I 455205
8.3%
C 455205
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 455205
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5462460
92.3%
Common 455205
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 910410
16.7%
A 910410
16.7%
N 455205
8.3%
O 455205
8.3%
T 455205
8.3%
H 455205
8.3%
M 455205
8.3%
E 455205
8.3%
I 455205
8.3%
C 455205
8.3%
Common
ValueCountFrequency (%)
_ 455205
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5917665
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 910410
15.4%
A 910410
15.4%
N 455205
7.7%
O 455205
7.7%
T 455205
7.7%
H 455205
7.7%
_ 455205
7.7%
M 455205
7.7%
E 455205
7.7%
I 455205
7.7%

level0Gid
Text

Missing 

Distinct139
Distinct (%)0.3%
Missing407295
Missing (%)89.5%
Memory size3.5 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters143751
Distinct characters26
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique20 ?
Unique (%)< 0.1%

Sample

1st rowTON
2nd rowPHL
3rd rowBRA
4th rowPAN
5th rowIDN
Value               Count    Frequency (%)
usa                 11507    24.0%
phl                 5147     10.7%
ven                 2888     6.0%
bra                 2826     5.9%
idn                 2406     5.0%
fji                 2133     4.5%
sur                 1264     2.6%
per                 1237     2.6%
png                 1100     2.3%
slb                 1025     2.1%
Other values (129)  16384    34.2%

Most occurring characters

ValueCountFrequency (%)
A 17643
12.3%
S 16829
11.7%
U 15233
 
10.6%
N 10380
 
7.2%
P 9498
 
6.6%
L 8389
 
5.8%
R 7899
 
5.5%
H 6313
 
4.4%
I 5803
 
4.0%
E 5461
 
3.8%
Other values (16) 40303
28.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 143751
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 17643
12.3%
S 16829
11.7%
U 15233
 
10.6%
N 10380
 
7.2%
P 9498
 
6.6%
L 8389
 
5.8%
R 7899
 
5.5%
H 6313
 
4.4%
I 5803
 
4.0%
E 5461
 
3.8%
Other values (16) 40303
28.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 143751
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 17643
12.3%
S 16829
11.7%
U 15233
 
10.6%
N 10380
 
7.2%
P 9498
 
6.6%
L 8389
 
5.8%
R 7899
 
5.5%
H 6313
 
4.4%
I 5803
 
4.0%
E 5461
 
3.8%
Other values (16) 40303
28.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 143751
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 17643
12.3%
S 16829
11.7%
U 15233
 
10.6%
N 10380
 
7.2%
P 9498
 
6.6%
L 8389
 
5.8%
R 7899
 
5.5%
H 6313
 
4.4%
I 5803
 
4.0%
E 5461
 
3.8%
Other values (16) 40303
28.0%

level0Name
Text

Missing 

Distinct139
Distinct (%)0.3%
Missing407295
Missing (%)89.5%
Memory size3.5 MiB

Length

Max length32
Median length24
Mean length10.19049607
Min length4

Characters and Unicode

Total characters488298
Distinct characters61
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique20 ?
Unique (%)< 0.1%

Sample

1st rowTonga
2nd rowPhilippines
3rd rowBrazil
4th rowPanama
5th rowIndonesia
Value               Count    Frequency (%)
united              11597    16.7%
states              11597    16.7%
philippines         5147     7.4%
venezuela           2888     4.2%
brazil              2826     4.1%
indonesia           2406     3.5%
fiji                2133     3.1%
new                 1388     2.0%
and                 1366     2.0%
islands             1352     1.9%
Other values (173)  26850    38.6%

Most occurring characters

Value | Count | Frequency (%)
e | 53763 | 11.0%
i | 52703 | 10.8%
a | 51807 | 10.6%
n | 41316 | 8.5%
t | 38500 | 7.9%
s | 27341 | 5.6%
(space) | 21633 | 4.4%
d | 20330 | 4.2%
l | 20222 | 4.1%
S | 15363 | 3.1%
Other values (51) | 145320 | 29.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 398596 | 81.6%
Uppercase Letter | 68018 | 13.9%
Space Separator | 21633 | 4.4%
Other Punctuation | 40 | < 0.1%
Open Punctuation | 5 | < 0.1%
Close Punctuation | 5 | < 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 53763 | 13.5%
i | 52703 | 13.2%
a | 51807 | 13.0%
n | 41316 | 10.4%
t | 38500 | 9.7%
s | 27341 | 6.9%
d | 20330 | 5.1%
l | 20222 | 5.1%
o | 15006 | 3.8%
r | 14531 | 3.6%
Other values (21) | 63077 | 15.8%
Uppercase Letter
Value | Count | Frequency (%)
S | 15363 | 22.6%
U | 11602 | 17.1%
P | 9290 | 13.7%
B | 4522 | 6.6%
I | 4345 | 6.4%
T | 3836 | 5.6%
V | 3518 | 5.2%
C | 3155 | 4.6%
F | 3136 | 4.6%
M | 2238 | 3.3%
Other values (14) | 7013 | 10.3%
Other Punctuation
Value | Count | Frequency (%)
, | 38 | 95.0%
' | 2 | 5.0%
Space Separator
Value | Count | Frequency (%)
(space) | 21633 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 466614 | 95.6%
Common | 21684 | 4.4%
4.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 53763 | 11.5%
i | 52703 | 11.3%
a | 51807 | 11.1%
n | 41316 | 8.9%
t | 38500 | 8.3%
s | 27341 | 5.9%
d | 20330 | 4.4%
l | 20222 | 4.3%
S | 15363 | 3.3%
o | 15006 | 3.2%
Other values (45) | 130263 | 27.9%
Common
Value | Count | Frequency (%)
(space) | 21633 | 99.8%
, | 38 | 0.2%
( | 5 | < 0.1%
) | 5 | < 0.1%
' | 2 | < 0.1%
- | 1 | < 0.1%
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 487641 | 99.9%
None | 657 | 0.1%
0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 53763 | 11.0%
i | 52703 | 10.8%
a | 51807 | 10.6%
n | 41316 | 8.5%
t | 38500 | 7.9%
s | 27341 | 5.6%
(space) | 21633 | 4.4%
d | 20330 | 4.2%
l | 20222 | 4.1%
S | 15363 | 3.2%
Other values (46) | 144663 | 29.7%
None
Value | Count | Frequency (%)
ç | 503 | 76.6%
é | 150 | 22.8%
ô | 2 | 0.3%
ã | 1 | 0.2%
í | 1 | 0.2%

level1Gid
Text

Missing 

Distinct: 629
Distinct (%): 1.3%
Missing: 408402
Missing (%): 89.7%
Memory size: 3.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.589168981
Min length: 6

Characters and Unicode

Total characters: 355249
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 134
Unique (%): 0.3%

Sample

1st row: TON.5_1
2nd row: PHL.52_1
3rd row: BRA.13_1
4th row: PAN.5_1
5th row: IDN.12_1

Value | Count | Frequency (%)
usa.47_1 | 2114 | 4.5%
usa.39_1 | 1909 | 4.1%
usa.21_1 | 1178 | 2.5%
fji.4_1 | 1089 | 2.3%
phl.52_1 | 1010 | 2.2%
sur.9_1 | 986 | 2.1%
bra.4_1 | 986 | 2.1%
usa.49_1 | 966 | 2.1%
fji.2_1 | 918 | 2.0%
idn.19_1 | 915 | 2.0%
Other values (619) | 34739 | 74.2%

Most occurring characters

Value | Count | Frequency (%)
1 | 61632 | 17.3%
_ | 46806 | 13.2%
. | 46779 | 13.2%
A | 17604 | 5.0%
S | 16787 | 4.7%
U | 14730 | 4.1%
2 | 12539 | 3.5%
4 | 10903 | 3.1%
N | 10380 | 2.9%
3 | 9530 | 2.7%
Other values (28) | 107559 | 30.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 140442 | 39.5%
Decimal Number | 121222 | 34.1%
Connector Punctuation | 46806 | 13.2%
Other Punctuation | 46779 | 13.2%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 17604 | 12.5%
S | 16787 | 12.0%
U | 14730 | 10.5%
N | 10380 | 7.4%
P | 9498 | 6.8%
L | 8367 | 6.0%
R | 7692 | 5.5%
H | 6317 | 4.5%
E | 5461 | 3.9%
B | 5444 | 3.9%
Other values (16) | 38162 | 27.2%
Decimal Number
Value | Count | Frequency (%)
1 | 61632 | 50.8%
2 | 12539 | 10.3%
4 | 10903 | 9.0%
3 | 9530 | 7.9%
9 | 8556 | 7.1%
5 | 5584 | 4.6%
7 | 5302 | 4.4%
6 | 2955 | 2.4%
8 | 2789 | 2.3%
0 | 1432 | 1.2%
Connector Punctuation
Value | Count | Frequency (%)
_ | 46806 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 46779 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 214807 | 60.5%
Latin | 140442 | 39.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 17604 | 12.5%
S | 16787 | 12.0%
U | 14730 | 10.5%
N | 10380 | 7.4%
P | 9498 | 6.8%
L | 8367 | 6.0%
R | 7692 | 5.5%
H | 6317 | 4.5%
E | 5461 | 3.9%
B | 5444 | 3.9%
Other values (16) | 38162 | 27.2%
Common
Value | Count | Frequency (%)
1 | 61632 | 28.7%
_ | 46806 | 21.8%
. | 46779 | 21.8%
2 | 12539 | 5.8%
4 | 10903 | 5.1%
3 | 9530 | 4.4%
9 | 8556 | 4.0%
5 | 5584 | 2.6%
7 | 5302 | 2.5%
6 | 2955 | 1.4%
Other values (2) | 4221 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 355249 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 61632 | 17.3%
_ | 46806 | 13.2%
. | 46779 | 13.2%
A | 17604 | 5.0%
S | 16787 | 4.7%
U | 14730 | 4.1%
2 | 12539 | 3.5%
4 | 10903 | 3.1%
N | 10380 | 2.9%
3 | 9530 | 2.7%
Other values (28) | 107559 | 30.3%

level1Name
Text

Missing 

Distinct: 611
Distinct (%): 1.3%
Missing: 408402
Missing (%): 89.7%
Memory size: 3.5 MiB

Length

Max length: 30
Median length: 24
Mean length: 9.236039308
Min length: 3

Characters and Unicode

Total characters: 432339
Distinct characters: 82
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 130
Unique (%): 0.3%

Sample

1st row: Vava'u
2nd row: Negros Oriental
3rd row: Minas Gerais
4th row: Darién
5th row: Kalimantan Barat

Value | Count | Frequency (%)
virginia | 3080 | 4.8%
pennsylvania | 1909 | 3.0%
amazonas | 1611 | 2.5%
maryland | 1178 | 1.8%
south | 1090 | 1.7%
rotuma | 1089 | 1.7%
islands | 1083 | 1.7%
oriental | 1050 | 1.6%
negros | 1025 | 1.6%
eastern | 1020 | 1.6%
Other values (674) | 49891 | 77.9%

Most occurring characters

Value | Count | Frequency (%)
a | 64833 | 15.0%
n | 33919 | 7.8%
i | 32416 | 7.5%
r | 26649 | 6.2%
o | 26487 | 6.1%
e | 26064 | 6.0%
s | 19671 | 4.5%
t | 19496 | 4.5%
l | 19259 | 4.5%
(space) | 17216 | 4.0%
Other values (72) | 146329 | 33.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 350132 | 81.0%
Uppercase Letter | 63320 | 14.6%
Space Separator | 17216 | 4.0%
Dash Punctuation | 1031 | 0.2%
Other Punctuation | 564 | 0.1%
Modifier Symbol | 48 | < 0.1%
Open Punctuation | 14 | < 0.1%
Close Punctuation | 14 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 64833 | 18.5%
n | 33919 | 9.7%
i | 32416 | 9.3%
r | 26649 | 7.6%
o | 26487 | 7.6%
e | 26064 | 7.4%
s | 19671 | 5.6%
t | 19496 | 5.6%
l | 19259 | 5.5%
u | 14915 | 4.3%
Other values (34) | 66423 | 19.0%
Uppercase Letter
Value | Count | Frequency (%)
S | 7197 | 11.4%
M | 5791 | 9.1%
A | 5062 | 8.0%
B | 4023 | 6.4%
V | 3938 | 6.2%
T | 3895 | 6.2%
P | 3802 | 6.0%
N | 3754 | 5.9%
C | 3733 | 5.9%
O | 3489 | 5.5%
Other values (19) | 18636 | 29.4%
Other Punctuation
Value | Count | Frequency (%)
' | 552 | 97.9%
/ | 12 | 2.1%
Open Punctuation
Value | Count | Frequency (%)
[ | 11 | 78.6%
( | 3 | 21.4%
Close Punctuation
Value | Count | Frequency (%)
] | 11 | 78.6%
) | 3 | 21.4%
Space Separator
Value | Count | Frequency (%)
(space) | 17216 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1031 | 100.0%
Modifier Symbol
Value | Count | Frequency (%)
` | 48 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 413452 | 95.6%
Common | 18887 | 4.4%
4.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 64833 | 15.7%
n | 33919 | 8.2%
i | 32416 | 7.8%
r | 26649 | 6.4%
o | 26487 | 6.4%
e | 26064 | 6.3%
s | 19671 | 4.8%
t | 19496 | 4.7%
l | 19259 | 4.7%
u | 14915 | 3.6%
Other values (63) | 129743 | 31.4%
Common
Value | Count | Frequency (%)
(space) | 17216 | 91.2%
- | 1031 | 5.5%
' | 552 | 2.9%
` | 48 | 0.3%
/ | 12 | 0.1%
[ | 11 | 0.1%
] | 11 | 0.1%
( | 3 | < 0.1%
) | 3 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 427954 | 99.0%
None | 4316 | 1.0%
Latin Ext Additional | 69 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 64833 | 15.1%
n | 33919 | 7.9%
i | 32416 | 7.6%
r | 26649 | 6.2%
o | 26487 | 6.2%
e | 26064 | 6.1%
s | 19671 | 4.6%
t | 19496 | 4.6%
l | 19259 | 4.5%
(space) | 17216 | 4.0%
Other values (51) | 141944 | 33.2%
None
Value | Count | Frequency (%)
á | 1512 | 35.0%
Î | 928 | 21.5%
é | 860 | 19.9%
í | 392 | 9.1%
ó | 256 | 5.9%
ã | 136 | 3.2%
ò | 73 | 1.7%
ì | 68 | 1.6%
ö | 33 | 0.8%
ț | 14 | 0.3%
Other values (9) | 44 | 1.0%
Latin Ext Additional
ValueCountFrequency (%)
68
98.6%
1
 
1.4%

level2Gid
Text

Missing 

Distinct: 1834
Distinct (%): 4.2%
Missing: 412023
Missing (%): 90.5%
Memory size: 3.5 MiB

Length

Max length: 12
Median length: 11
Mean length: 10.0572368
Min length: 7

Characters and Unicode

Total characters: 434362
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 435
Unique (%): 1.0%

Sample

1st row: TON.5.0_1
2nd row: PHL.52.17_1
3rd row: BRA.13.511_2
4th row: PAN.5.2_1
5th row: IDN.12.14_1

Value | Count | Frequency (%)
fji.4.1_1 | 1089 | 2.5%
sur.9.5_1 | 722 | 1.7%
fji.2.2_1 | 640 | 1.5%
slb.7.26_1 | 534 | 1.2%
ton.5.0_1 | 471 | 1.1%
idn.19.1_1 | 469 | 1.1%
ven.9.1_1 | 455 | 1.1%
per.17.4_1 | 448 | 1.0%
idn.19.6_1 | 444 | 1.0%
idn.28.2_1 | 443 | 1.0%
Other values (1824) | 37474 | 86.8%

Most occurring characters

Value | Count | Frequency (%)
. | 86343 | 19.9%
1 | 68426 | 15.8%
_ | 43189 | 9.9%
2 | 25854 | 6.0%
A | 17316 | 4.0%
4 | 16359 | 3.8%
3 | 15943 | 3.7%
S | 15249 | 3.5%
U | 14430 | 3.3%
5 | 11495 | 2.6%
Other values (28) | 119758 | 27.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 175263 | 40.3%
Uppercase Letter | 129567 | 29.8%
Other Punctuation | 86343 | 19.9%
Connector Punctuation | 43189 | 9.9%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 17316 | 13.4%
S | 15249 | 11.8%
U | 14430 | 11.1%
N | 10350 | 8.0%
P | 8606 | 6.6%
L | 8302 | 6.4%
R | 7644 | 5.9%
H | 5908 | 4.6%
E | 5459 | 4.2%
I | 5168 | 4.0%
Other values (16) | 31135 | 24.0%
Decimal Number
Value | Count | Frequency (%)
1 | 68426 | 39.0%
2 | 25854 | 14.8%
4 | 16359 | 9.3%
3 | 15943 | 9.1%
5 | 11495 | 6.6%
9 | 11103 | 6.3%
7 | 8897 | 5.1%
6 | 7967 | 4.5%
8 | 5561 | 3.2%
0 | 3658 | 2.1%
Other Punctuation
Value | Count | Frequency (%)
. | 86343 | 100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 43189 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 304795 | 70.2%
Latin | 129567 | 29.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 17316 | 13.4%
S | 15249 | 11.8%
U | 14430 | 11.1%
N | 10350 | 8.0%
P | 8606 | 6.6%
L | 8302 | 6.4%
R | 7644 | 5.9%
H | 5908 | 4.6%
E | 5459 | 4.2%
I | 5168 | 4.0%
Other values (16) | 31135 | 24.0%
Common
Value | Count | Frequency (%)
. | 86343 | 28.3%
1 | 68426 | 22.4%
_ | 43189 | 14.2%
2 | 25854 | 8.5%
4 | 16359 | 5.4%
3 | 15943 | 5.2%
5 | 11495 | 3.8%
9 | 11103 | 3.6%
7 | 8897 | 2.9%
6 | 7967 | 2.6%
Other values (2) | 9219 | 3.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 434362 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 86343 | 19.9%
1 | 68426 | 15.8%
_ | 43189 | 9.9%
2 | 25854 | 6.0%
A | 17316 | 4.0%
4 | 16359 | 3.8%
3 | 15943 | 3.7%
S | 15249 | 3.5%
U | 14430 | 3.3%
5 | 11495 | 2.6%
Other values (28) | 119758 | 27.6%

level2Name
Text

Missing 

Distinct: 1707
Distinct (%): 4.0%
Missing: 412026
Missing (%): 90.5%
Memory size: 3.5 MiB

Length

Max length: 32
Median length: 28
Mean length: 8.320659473
Min length: 3

Characters and Unicode

Total characters: 359336
Distinct characters: 88
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 409
Unique (%): 0.9%

Sample

1st row: n.a.
2nd row: San Jose
3rd row: Nanuque
4th row: Pinogana
5th row: Sintang

Value | Count | Frequency (%)
city | 2133 | 3.8%
rotuma | 1089 | 1.9%
kabalebo | 722 | 1.3%
san | 695 | 1.2%
lau | 640 | 1.1%
n.a | 549 | 1.0%
sikaiana | 534 | 1.0%
tengah | 518 | 0.9%
antonio | 476 | 0.8%
ambon | 469 | 0.8%
Other values (1915) | 48185 | 86.0%

Most occurring characters

Value | Count | Frequency (%)
a | 53679 | 14.9%
n | 26896 | 7.5%
o | 25683 | 7.1%
e | 21825 | 6.1%
i | 21332 | 5.9%
u | 16906 | 4.7%
r | 16255 | 4.5%
t | 15400 | 4.3%
l | 14249 | 4.0%
(space) | 12824 | 3.6%
Other values (78) | 134287 | 37.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 287322 | 80.0%
Uppercase Letter | 55894 | 15.6%
Space Separator | 12824 | 3.6%
Other Punctuation | 1664 | 0.5%
Dash Punctuation | 1383 | 0.4%
Decimal Number | 223 | 0.1%
Open Punctuation | 15 | < 0.1%
Close Punctuation | 10 | < 0.1%
Initial Punctuation | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 53679 | 18.7%
n | 26896 | 9.4%
o | 25683 | 8.9%
e | 21825 | 7.6%
i | 21332 | 7.4%
u | 16906 | 5.9%
r | 16255 | 5.7%
t | 15400 | 5.4%
l | 14249 | 5.0%
s | 9165 | 3.2%
Other values (36) | 65932 | 22.9%
Uppercase Letter
Value | Count | Frequency (%)
C | 6228 | 11.1%
M | 5541 | 9.9%
S | 4806 | 8.6%
B | 4081 | 7.3%
P | 3518 | 6.3%
A | 3325 | 5.9%
N | 3098 | 5.5%
T | 2991 | 5.4%
K | 2898 | 5.2%
L | 2826 | 5.1%
Other values (19) | 16582 | 29.7%
Other Punctuation
Value | Count | Frequency (%)
. | 1239 | 74.5%
' | 412 | 24.8%
# | 11 | 0.7%
/ | 2 | 0.1%
Decimal Number
Value | Count | Frequency (%)
1 | 152 | 68.2%
0 | 53 | 23.8%
7 | 13 | 5.8%
8 | 5 | 2.2%
Space Separator
Value | Count | Frequency (%)
(space) | 12824 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1383 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 15 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 343216 | 95.5%
Common | 16120 | 4.5%
4.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 53679 | 15.6%
n | 26896 | 7.8%
o | 25683 | 7.5%
e | 21825 | 6.4%
i | 21332 | 6.2%
u | 16906 | 4.9%
r | 16255 | 4.7%
t | 15400 | 4.5%
l | 14249 | 4.2%
s | 9165 | 2.7%
Other values (65) | 121826 | 35.5%
Common
Value | Count | Frequency (%)
(space) | 12824 | 79.6%
- | 1383 | 8.6%
. | 1239 | 7.7%
' | 412 | 2.6%
1 | 152 | 0.9%
0 | 53 | 0.3%
( | 15 | 0.1%
7 | 13 | 0.1%
# | 11 | 0.1%
) | 10 | 0.1%
Other values (3) | 8 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 356398 | 99.2%
None | 2869 | 0.8%
Latin Ext Additional | 68 | < 0.1%
Punctuation | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 53679 | 15.1%
n | 26896 | 7.5%
o | 25683 | 7.2%
e | 21825 | 6.1%
i | 21332 | 6.0%
u | 16906 | 4.7%
r | 16255 | 4.6%
t | 15400 | 4.3%
l | 14249 | 4.0%
(space) | 12824 | 3.6%
Other values (54) | 131349 | 36.9%
None
Value | Count | Frequency (%)
í | 750 | 26.1%
á | 637 | 22.2%
é | 514 | 17.9%
ã | 232 | 8.1%
ó | 225 | 7.8%
ñ | 217 | 7.6%
ú | 117 | 4.1%
ç | 79 | 2.8%
ô | 33 | 1.2%
Ó | 23 | 0.8%
Other values (12) | 42 | 1.5%
Latin Ext Additional
Value | Count | Frequency (%)
ế | 68 | 100.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

level3Gid
Text

Missing 

Distinct: 763
Distinct (%): 5.5%
Missing: 441377
Missing (%): 97.0%
Memory size: 3.5 MiB

Length

Max length: 15
Median length: 14
Mean length: 12.36754608
Min length: 11

Characters and Unicode

Total characters: 171105
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 214
Unique (%): 1.5%

Sample

1st row: PHL.52.17.11_1
2nd row: PAN.5.2.4_1
3rd row: IDN.12.14.12_1
4th row: PHL.69.7.31_1
5th row: CMR.9.6.8_1

Value | Count | Frequency (%)
idn.28.2.4_1 | 443 | 3.2%
bol.3.8.2_2 | 442 | 3.2%
per.18.3.4_1 | 329 | 2.4%
per.17.4.4_1 | 312 | 2.3%
idn.19.1.3_1 | 266 | 1.9%
cmr.9.6.8_1 | 253 | 1.8%
cmr.9.4.2_1 | 216 | 1.6%
phl.36.37.65_1 | 201 | 1.5%
phl.52.25.3_1 | 191 | 1.4%
phl.52.17.11_1 | 187 | 1.4%
Other values (753) | 10995 | 79.5%
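
The identifier values above follow a regular shape: a three-letter code, dot-separated integers, an underscore, and a trailing integer. The sketch below splits such a string into its parts. Reading the parts as a country code, one index per administrative level, and a version-like suffix is an assumption made for illustration, and `Gid` and `parse_gid` are hypothetical names, not part of the dataset or the profiling tool.

```python
import re
from typing import NamedTuple, Tuple

# Three letters, one or more dot-separated integers, "_" and a final integer.
GID_PATTERN = re.compile(r"^([A-Za-z]{3})\.(\d+(?:\.\d+)*)_(\d+)$")


class Gid(NamedTuple):
    country: str               # assumed: three-letter country code
    indices: Tuple[int, ...]   # assumed: one index per administrative level
    suffix: int                # assumed: version-like trailing number


def parse_gid(gid: str) -> Gid:
    """Split an identifier such as 'PHL.52.17.11_1' into its components."""
    match = GID_PATTERN.match(gid)
    if match is None:
        raise ValueError(f"unrecognised identifier: {gid!r}")
    country, dotted, suffix = match.groups()
    indices = tuple(int(part) for part in dotted.split("."))
    return Gid(country.upper(), indices, int(suffix))


parsed = parse_gid("PHL.52.17.11_1")
depth = len(parsed.indices)  # number of dot-separated levels in the identifier
```

The suffix is kept separate because it is not constant across rows (compare `bol.3.8.2_2` with the `_1` values), so it cannot be dropped or inferred when comparing identifiers.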

Most occurring characters

Value | Count | Frequency (%)
. | 41505 | 24.3%
1 | 27915 | 16.3%
_ | 13835 | 8.1%
2 | 11592 | 6.8%
P | 7360 | 4.3%
5 | 6289 | 3.7%
L | 6013 | 3.5%
H | 5824 | 3.4%
3 | 5449 | 3.2%
4 | 5221 | 3.1%
Other values (24) | 40102 | 23.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 74260 | 43.4%
Other Punctuation | 41505 | 24.3%
Uppercase Letter | 41505 | 24.3%
Connector Punctuation | 13835 | 8.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
P | 7360 | 17.7%
L | 6013 | 14.5%
H | 5824 | 14.0%
N | 3826 | 9.2%
R | 2786 | 6.7%
I | 2610 | 6.3%
D | 2583 | 6.2%
M | 2553 | 6.2%
A | 1776 | 4.3%
E | 1711 | 4.1%
Other values (12) | 4463 | 10.8%
Decimal Number
Value | Count | Frequency (%)
1 | 27915 | 37.6%
2 | 11592 | 15.6%
5 | 6289 | 8.5%
3 | 5449 | 7.3%
4 | 5221 | 7.0%
6 | 4843 | 6.5%
9 | 4402 | 5.9%
7 | 3749 | 5.0%
8 | 3548 | 4.8%
0 | 1252 | 1.7%
Other Punctuation
Value | Count | Frequency (%)
. | 41505 | 100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 13835 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 129600 | 75.7%
Latin | 41505 | 24.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
P | 7360 | 17.7%
L | 6013 | 14.5%
H | 5824 | 14.0%
N | 3826 | 9.2%
R | 2786 | 6.7%
I | 2610 | 6.3%
D | 2583 | 6.2%
M | 2553 | 6.2%
A | 1776 | 4.3%
E | 1711 | 4.1%
Other values (12) | 4463 | 10.8%
Common
Value | Count | Frequency (%)
. | 41505 | 32.0%
1 | 27915 | 21.5%
_ | 13835 | 10.7%
2 | 11592 | 8.9%
5 | 6289 | 4.9%
3 | 5449 | 4.2%
4 | 5221 | 4.0%
6 | 4843 | 3.7%
9 | 4402 | 3.4%
7 | 3749 | 2.9%
Other values (2) | 4800 | 3.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 171105 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 41505 | 24.3%
1 | 27915 | 16.3%
_ | 13835 | 8.1%
2 | 11592 | 6.8%
P | 7360 | 4.3%
5 | 6289 | 3.7%
L | 6013 | 3.5%
H | 5824 | 3.4%
3 | 5449 | 3.2%
4 | 5221 | 3.1%
Other values (24) | 40102 | 23.4%

level3Name
Text

Missing 

Distinct: 731
Distinct (%): 5.3%
Missing: 441442
Missing (%): 97.0%
Memory size: 3.5 MiB

Length

Max length: 32
Median length: 25
Mean length: 9.582207698
Min length: 3

Characters and Unicode

Total characters: 131947
Distinct characters: 91
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 211
Unique (%): 1.5%

Sample

1st row: Señora Ascion
2nd row: Metetí
3rd row: Sintang
4th row: Pinontingan
5th row: Mundemba

Value | Count | Frequency (%)
santa | 988 | 4.6%
poblacion | 541 | 2.5%
ana | 515 | 2.4%
timur | 501 | 2.4%
kabaena | 443 | 2.1%
san | 351 | 1.7%
tambopata | 329 | 1.5%
iquitos | 312 | 1.5%
barangay | 304 | 1.4%
de | 304 | 1.4%
Other values (882) | 16663 | 78.4%

Most occurring characters

Value | Count | Frequency (%)
a | 22534 | 17.1%
n | 11149 | 8.4%
o | 9569 | 7.3%
(space) | 7481 | 5.7%
i | 6726 | 5.1%
u | 6243 | 4.7%
r | 5659 | 4.3%
e | 5013 | 3.8%
t | 4320 | 3.3%
l | 3979 | 3.0%
Other values (81) | 49274 | 37.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 101297 | 76.8%
Uppercase Letter | 20490 | 15.5%
Space Separator | 7481 | 5.7%
Decimal Number | 1246 | 0.9%
Other Punctuation | 733 | 0.6%
Open Punctuation | 279 | 0.2%
Close Punctuation | 272 | 0.2%
Dash Punctuation | 149 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 22534 | 22.2%
n | 11149 | 11.0%
o | 9569 | 9.4%
i | 6726 | 6.6%
u | 6243 | 6.2%
r | 5659 | 5.6%
e | 5013 | 4.9%
t | 4320 | 4.3%
l | 3979 | 3.9%
b | 3282 | 3.2%
Other values (35) | 22823 | 22.5%
Uppercase Letter
Value | Count | Frequency (%)
S | 2880 | 14.1%
P | 1817 | 8.9%
T | 1788 | 8.7%
M | 1587 | 7.7%
A | 1506 | 7.3%
K | 1428 | 7.0%
B | 1222 | 6.0%
N | 1175 | 5.7%
I | 1107 | 5.4%
C | 1072 | 5.2%
Other values (17) | 4908 | 24.0%
Decimal Number
Value | Count | Frequency (%)
6 | 334 | 26.8%
2 | 324 | 26.0%
1 | 277 | 22.2%
9 | 119 | 9.6%
8 | 65 | 5.2%
3 | 47 | 3.8%
0 | 42 | 3.4%
5 | 22 | 1.8%
7 | 14 | 1.1%
4 | 2 | 0.2%
Other Punctuation
Value | Count | Frequency (%)
. | 577 | 78.7%
, | 141 | 19.2%
" | 8 | 1.1%
' | 6 | 0.8%
/ | 1 | 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 7481 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 279 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 272 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 149 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 121787 | 92.3%
Common | 10160 | 7.7%
7.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 22534 | 18.5%
n | 11149 | 9.2%
o | 9569 | 7.9%
i | 6726 | 5.5%
u | 6243 | 5.1%
r | 5659 | 4.6%
e | 5013 | 4.1%
t | 4320 | 3.5%
l | 3979 | 3.3%
b | 3282 | 2.7%
Other values (62) | 43313 | 35.6%
Common
Value | Count | Frequency (%)
(space) | 7481 | 73.6%
. | 577 | 5.7%
6 | 334 | 3.3%
2 | 324 | 3.2%
( | 279 | 2.7%
1 | 277 | 2.7%
) | 272 | 2.7%
- | 149 | 1.5%
, | 141 | 1.4%
9 | 119 | 1.2%
Other values (9) | 207 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 130475 | 98.9%
None | 1373 | 1.0%
Latin Ext Additional | 99 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 22534 | 17.3%
n | 11149 | 8.5%
o | 9569 | 7.3%
(space) | 7481 | 5.7%
i | 6726 | 5.2%
u | 6243 | 4.8%
r | 5659 | 4.3%
e | 5013 | 3.8%
t | 4320 | 3.3%
l | 3979 | 3.0%
Other values (61) | 47802 | 36.6%
None
Value | Count | Frequency (%)
í | 518 | 37.7%
ñ | 260 | 18.9%
á | 230 | 16.8%
é | 94 | 6.8%
ĩ | 63 | 4.6%
ư | 57 | 4.2%
ũ | 54 | 3.9%
ú | 40 | 2.9%
ó | 39 | 2.8%
Đ | 14 | 1.0%
Other values (4) | 4 | 0.3%
Latin Ext Additional
ValueCountFrequency (%)
41
41.4%
16
 
16.2%
14
 
14.1%
14
 
14.1%
13
 
13.1%
1
 
1.0%

iucnRedListCategory
Text

Missing 

Distinct: 9
Distinct (%): < 0.1%
Missing: 11501
Missing (%): 2.5%
Memory size: 3.5 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 887422
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: LC
2nd row: NE
3rd row: LC
4th row: LC
5th row: LC

Value | Count | Frequency (%)
lc | 278407 | 62.7%
ne | 139325 | 31.4%
dd | 10088 | 2.3%
vu | 7110 | 1.6%
nt | 4625 | 1.0%
en | 2883 | 0.6%
cr | 1136 | 0.3%
ex | 119 | < 0.1%
ew | 18 | < 0.1%
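
The two-letter values match the standard IUCN Red List category abbreviations. As a small worked check, the sketch below expands the codes and recomputes each share from the counts listed above. The dictionary and function names are illustrative, and treating CR, EN and VU together as "threatened" follows the usual IUCN grouping rather than anything stated in this report.

```python
# Standard IUCN Red List category abbreviations.
IUCN_CATEGORIES = {
    "LC": "Least Concern",
    "NE": "Not Evaluated",
    "DD": "Data Deficient",
    "VU": "Vulnerable",
    "NT": "Near Threatened",
    "EN": "Endangered",
    "CR": "Critically Endangered",
    "EX": "Extinct",
    "EW": "Extinct in the Wild",
}

# Counts copied from the frequency table for iucnRedListCategory.
COUNTS = {
    "LC": 278407, "NE": 139325, "DD": 10088, "VU": 7110, "NT": 4625,
    "EN": 2883, "CR": 1136, "EX": 119, "EW": 18,
}


def shares(counts):
    """Return each category's share of non-missing records, in percent."""
    total = sum(counts.values())
    return {code: round(100 * n / total, 1) for code, n in counts.items()}


total_records = sum(COUNTS.values())
percentages = shares(COUNTS)
threatened = COUNTS["CR"] + COUNTS["EN"] + COUNTS["VU"]
```

The counts sum to 443711, which equals the 455212 observations minus the 11501 missing values, and twice that figure gives the 887422 total characters reported for this two-character field.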

Most occurring characters

Value | Count | Frequency (%)
C | 279543 | 31.5%
L | 278407 | 31.4%
N | 146833 | 16.5%
E | 142345 | 16.0%
D | 20176 | 2.3%
V | 7110 | 0.8%
U | 7110 | 0.8%
T | 4625 | 0.5%
R | 1136 | 0.1%
X | 119 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 887422 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
C | 279543 | 31.5%
L | 278407 | 31.4%
N | 146833 | 16.5%
E | 142345 | 16.0%
D | 20176 | 2.3%
V | 7110 | 0.8%
U | 7110 | 0.8%
T | 4625 | 0.5%
R | 1136 | 0.1%
X | 119 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 887422 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
C | 279543 | 31.5%
L | 278407 | 31.4%
N | 146833 | 16.5%
E | 142345 | 16.0%
D | 20176 | 2.3%
V | 7110 | 0.8%
U | 7110 | 0.8%
T | 4625 | 0.5%
R | 1136 | 0.1%
X | 119 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 887422 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
C | 279543 | 31.5%
L | 278407 | 31.4%
N | 146833 | 16.5%
E | 142345 | 16.0%
D | 20176 | 2.3%
V | 7110 | 0.8%
U | 7110 | 0.8%
T | 4625 | 0.5%
R | 1136 | 0.1%
X | 119 | < 0.1%